DataSet - A Polymorph Collection

10 Mar 2015 · CPOL · 4 min read
A set of classes providing a polymorphic data structure.

Introduction 

The DataSet is a simple class that holds items of any type. I initially used it for providing data to the client layer, encapsulating the server-side functionality by exposing only data. It was then expanded with ConfigData, a wrapped DataSet that provides configuration data and enforces reading all the items.

This article describes the package of classes I use to support the DataSet. On its own this isn't much, but I realised that a lot of the other subjects I want to post articles on require it, as it is my de facto structure for throwing data around.

This can be used in place of JSON and I intend to have converters working reliably soon.

Background 

It is often the case that a class has no function, just value. Over time, I grew tired of writing myriad data classes with nothing but accessor methods. So, based on a design I had used in various jobs, I went about building my own.

Base Classes

The structure is as simple as can be. A DataSet contains a set of DataItem objects. Each item takes a name/value pair and, once created, is immutable. After getting stung trying to use generics, I shifted the design so that the item holds an Object and the class itself ensures safe casting, rather than leaving that to be dealt with at a higher level.

In the DataSet, items can be retrieved as an object or the value accessed directly; this saves on a lot of null checks and method chaining.

The 'native' types that can be safely called from the DataSet and DataItem are all listed in enum ValueType. All other types are treated as an Object and the consumer is responsible for managing them safely. 
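
The idea of an immutable item that holds an Object and does its own safe casting can be sketched roughly as follows. This is not the article's actual code: the class name DataSetSketch, the nested Item class, the as() helper and the fallback-on-mismatch behaviour are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the real DataSet/DataItem classes.
class DataSetSketch {

    /** Immutable name/value pair; the value is stored as a plain Object. */
    static final class Item {
        private final String name;
        private final Object value;

        Item(String name, Object value) {
            this.name = name;
            this.value = value;
        }

        String getName() { return name; }

        Object getValue() { return value; }

        /** Safe cast: returns the fallback instead of throwing on a type mismatch. */
        <T> T as(Class<T> type, T fallback) {
            return type.isInstance(value) ? type.cast(value) : fallback;
        }
    }

    private final Map<String, Item> items = new LinkedHashMap<>();

    void put(String name, Object value) {
        items.put(name, new Item(name, value));
    }

    /** Retrieve the item itself (may be null if the name is unknown). */
    Item get(String name) {
        return items.get(name);
    }

    /** Direct access to the value: no null check or chaining needed by the caller. */
    String getString(String name, String fallback) {
        Item item = items.get(name);
        return item == null ? fallback : item.as(String.class, fallback);
    }

    int getInt(String name, int fallback) {
        Item item = items.get(name);
        return item == null ? fallback : item.as(Integer.class, fallback);
    }
}
```

With this shape, a caller can write data.getInt("age", -1) without checking whether the item exists or holds the right type; whether the real classes fall back to a default or throw is not stated in the article.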

There are then two wrapper classes. ConfigData is used for loading an application's configuration from a file, and SealedDataSet provides read-only access to a DataSet.
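
As a rough illustration of what "enforces reading all the items" could mean, the sketch below tracks which names have been read and complains on close() if any were ignored. The class ConfigDataSketch, its use of a plain Map, and the choice of IllegalStateException are assumptions; the real ConfigData may behave differently.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a read-only config wrapper that tracks unread items.
class ConfigDataSketch implements AutoCloseable {
    private final Map<String, Object> values;
    private final Set<String> unread;

    ConfigDataSketch(Map<String, Object> source) {
        // Defensive copy, exposed read-only: callers cannot alter the config.
        this.values = Collections.unmodifiableMap(new LinkedHashMap<>(source));
        this.unread = new TreeSet<>(source.keySet());
    }

    /** Returns the value and marks the item as read. */
    Object get(String name) {
        unread.remove(name);
        return values.get(name);
    }

    /** Names that have not been read yet. */
    Set<String> unreadNames() {
        return Collections.unmodifiableSet(unread);
    }

    /** Fails if the application ignored any configuration item. */
    @Override
    public void close() {
        if (!unread.isEmpty()) {
            throw new IllegalStateException("Unread configuration items: " + unread);
        }
    }
}
```

The benefit of this kind of check is that a misspelt or obsolete entry in a config file is reported at start-up rather than silently ignored.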

Transport Layer 

The two major uses I have for the DataSet are communication and persistence. For this I provide two classes that read and write the data. 

DataReader takes an inbound String from a BufferedReader and turns it into a DataSet. DataWriter performs the reverse, turning a DataSet into a String and writing it to a BufferedWriter. The format of these strings is nice and simple.

An example of the format, taken from the JavaDoc, is:

# comments can be added at any point and each
# item is written thus:
# name t value
example {
# Boolean -
   isMale ? TRUE
# DataSet -
   subBlock {
      item - simple text.
   } 
# Date -
   dateWriten @ 2013-02-27 12:00:25.3789Z
# Double -
   weight $ 75.3
# Integer -
   age % 42
# Long -
   big = 123456
# String [without reserved characters] -
   name - William Norman-Walker
# String [with reserved characters] -
   longText \
\# this will all be read as\
it \@ contains all the special characters \\ escaped\
\{ so the parser can read them \}.\
all for \$15.00 \@ \-50\% discount
# An array can be a list of any supported objects
 array [
  - each element in an array
  - will appear on a different line
  ! 123
  - data types can be mixed
 ]
} 

It is a bit like a few other formats I know, but that's how it evolved. For simple config files, the type character can be omitted, in which case the value is treated as a string. The extended format is actually overkill; simply putting a backslash before the line terminator is enough for it to be parsed cleanly.
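
To make the "name t value" layout concrete, here is a minimal sketch of how a single item line might be split and converted. It is not the article's DataReader: the class ItemLineSketch and the parseLine method are hypothetical, the type characters are taken only from the example above, dates are left as plain strings, and blocks, arrays and backslash escapes are deliberately ignored.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical single-line parser for the "name t value" layout.
class ItemLineSketch {

    /** Returns the parsed name/value pair, or empty for blank and comment lines. */
    static Optional<Map.Entry<String, Object>> parseLine(String raw) {
        String line = raw.trim();
        if (line.isEmpty() || line.startsWith("#")) {
            return Optional.empty();
        }
        int space = line.indexOf(' ');
        if (space < 0) {
            return Optional.empty(); // a name with no value is out of scope here
        }
        String name = line.substring(0, space);
        String rest = line.substring(space + 1).trim();

        // If no type character is present, the value is treated as a string.
        char type = '-';
        if (rest.length() >= 2 && "?@$%=-".indexOf(rest.charAt(0)) >= 0
                && rest.charAt(1) == ' ') {
            type = rest.charAt(0);
            rest = rest.substring(2).trim();
        }

        Object value;
        switch (type) {
            case '?': value = Boolean.parseBoolean(rest); break; // Boolean
            case '$': value = Double.valueOf(rest); break;       // Double
            case '%': value = Integer.valueOf(rest); break;      // Integer
            case '=': value = Long.valueOf(rest); break;         // Long
            default:  value = rest;                              // String (and Date, unparsed)
        }
        Map.Entry<String, Object> entry = new AbstractMap.SimpleImmutableEntry<>(name, value);
        return Optional.of(entry);
    }
}
```

For instance, parseLine("age % 42") yields the name "age" with an Integer value, while parseLine("title plain text") falls back to a String because no type character is given.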

Using the code

So how do I use it? The first way is for loading configuration options. I store the config for an app in a data file and then load it in at start up:

Java
import java.io.File;
import java.io.IOException;

/**
 * Standard entry point.
 * @param args command-line arguments (unused)
 */
public static void main (String[] args) {
    DataReader reader = null;
    try {
        // Read the config file into a DataSet and wrap it as ConfigData.
        reader = new DataReader(new File("demmo.config"));
        ConfigData config = new ConfigData(reader.read());
        App app = new App(config);
        // Finished reading the configuration.
        config.close();
    } catch (DataException ex) {
        System.err.print(ex);
    } catch (IOException ex) {
        System.err.print(ex);
    } finally {
        // Always release the reader, even if loading failed.
        if (reader != null) {
            try {
                reader.close();
            } catch (IOException ex) {
                System.err.print(ex);
            }
        }
    }
} 

I am not a great fan of throwing exceptions, but here I think it does make sense. Worth considering is a constructor for ConfigData that takes a file and deals with all the underlying problems itself.
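
One possible shape for such a convenience loader is sketched below. It is only an illustration under stated assumptions: ConfigFileSketch, parseLines and tryLoad are hypothetical names, the file is assumed to be UTF-8, and only the simple untyped "name value" lines are handled, with a plain Map standing in for ConfigData.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical loader that hides file handling and I/O errors from the caller.
class ConfigFileSketch {

    /** Parses simple "name value" lines, skipping blanks and # comments. */
    static Map<String, String> parseLines(List<String> lines) {
        Map<String, String> items = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int space = line.indexOf(' ');
            String name = space < 0 ? line : line.substring(0, space);
            String value = space < 0 ? "" : line.substring(space + 1).trim();
            items.put(name, value);
        }
        return items;
    }

    /** Loads the file; returns empty (and logs) if it cannot be read. */
    static Optional<Map<String, String>> tryLoad(Path file) {
        try {
            // readAllLines opens and closes the file itself.
            return Optional.of(parseLines(Files.readAllLines(file, StandardCharsets.UTF_8)));
        } catch (IOException ex) {
            System.err.println("Could not read " + file + ": " + ex);
            return Optional.empty();
        }
    }
}
```

The caller then only has to check whether a result is present, which removes the nested try/finally from the entry point at the cost of hiding the exact failure reason.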

Arrays

It became apparent that I needed arrays. While using a MongoDB back-end, I found so many problems moving between JSON and the safer DataSet that I decided to bite the bullet.

To add an array as an item, it first needs to be placed in a ValueArray, which makes sure the data is managed properly.

The values in the array must conform to the normal value types and can themselves be arrays, but at present nulls are not supported. This may change in the future.

To add an array, the code is nice and simple:

 
Java
// Add the array:
int[] arr = {1, 2, 3};
data.put("first", new ValueArray(arr));

ValueArray va = new ValueArray();
va.add("X");
va.add(42);
data.put("second", va);
// or add to an existing one
data.getArray("first").add("Something");
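
The rules described for array values (supported value types only, nested arrays allowed, no nulls) could be enforced along the following lines. This is a sketch under assumptions, not the real ValueArray: the class ValueArraySketch is hypothetical, the list of accepted types is inferred from the format example, and nested DataSet elements are left out to keep the block self-contained.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical array wrapper that validates every element on the way in.
class ValueArraySketch {
    private final List<Object> values = new ArrayList<>();

    /** Convenience constructor: each initial element goes through add(). */
    ValueArraySketch(Object... initial) {
        for (Object value : initial) {
            add(value);
        }
    }

    /** Adds one element, rejecting nulls and unsupported types. */
    void add(Object value) {
        if (value == null) {
            throw new IllegalArgumentException("null elements are not supported");
        }
        if (!isSupported(value)) {
            throw new IllegalArgumentException(
                    "Unsupported element type: " + value.getClass().getName());
        }
        values.add(value);
    }

    /** Accepted types: the simple value types plus nested arrays. */
    static boolean isSupported(Object value) {
        return value instanceof Boolean
                || value instanceof Integer
                || value instanceof Long
                || value instanceof Double
                || value instanceof String
                || value instanceof Date
                || value instanceof ValueArraySketch;
    }

    Object get(int index) { return values.get(index); }

    int size() { return values.size(); }
}
```

Validating at the point of insertion means a bad element is reported where it is added, rather than later when the array is written out.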

Chaining

As part of adding support for arrays, I have introduced the concept of chaining. Any add to the structure returns the component that was added to, which allows commands to be chained. A chain of commands means we can replace:

Java
DataSet data = new DataSet();
data.put("A", 1);
ValueArray vaB = new ValueArray();
vaB.add("One");
vaB.add(2);
vaB.add(9876543210L);
vaB.add(3.4);
vaB.add(false);
ValueArray innerArray = new ValueArray();
innerArray.add(1);
innerArray.add(2);
vaB.add(innerArray);
DataSet innerData = new DataSet();
innerData.put("key","value");
vaB.add(innerData);
vaB.add(new java.util.Date());
data.put("B", vaB);
DataSet subset = new DataSet();
subset.put("D", null);
subset.put("E", "1\\2#3-4\"5{6}7?8$9@a%b\nc");
subset.put("F", "So-long, farewell Adure!");
data.put("C", subset);

With this:

Java
DataSet data = new DataSet()
	.put("A", 1)
	.put("B", new ValueArray()
		.add("One")
		.add(2)
		.add(9876543210L)
		.add(3.4)
		.add(false)
		.add(new ValueArray(1,2))
		.add(new DataSet().put("key","value"))
		.add(new java.util.Date())
	)
	.put("C", new DataSet()
		.put("D", null)
		.put("E", "1\\2#3-4\"5{6}7?8$9@a%b\nc")
		.put("F", "So-long, farewell Adure!")
	);

I know which I prefer!
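
Mechanically, chaining of this kind usually comes down to each mutating method returning the object it was called on. The sketch below shows that pattern with two hypothetical stand-in classes, Bag and Seq; it is an assumption about the general technique, not a copy of how DataSet and ValueArray are written.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins showing the "return this" chaining pattern.
class ChainSketch {

    /** Named items; put() returns the Bag so calls can be chained. */
    static final class Bag {
        private final Map<String, Object> items = new LinkedHashMap<>();

        Bag put(String name, Object value) {
            items.put(name, value);
            return this; // the component that was added to
        }

        Object get(String name) { return items.get(name); }

        int size() { return items.size(); }
    }

    /** Ordered values; add() returns the Seq so calls can be chained. */
    static final class Seq {
        private final List<Object> values = new ArrayList<>();

        Seq add(Object value) {
            values.add(value);
            return this;
        }

        int size() { return values.size(); }
    }

    public static void main(String[] args) {
        Bag data = new Bag()
                .put("A", 1)
                .put("B", new Seq().add("One").add(2))
                .put("C", new Bag().put("key", "value"));
        System.out.println(data.size() + " top-level items");
    }
}
```

Because every call hands back the container rather than the element, nested structures can be built inline as arguments, which is what makes the second listing above so much shorter than the first.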

Points of Interest

I use these classes in a lot of the stuff I write. The purpose of this article is to document them once so that I do not have to revisit them in future articles. For those interested, the next article should be on an expression evaluator that uses a DataSet to find variable names.

History 

  • 2013-03 - Initial submission to CodeProject.
  • 2013-03-13 - Updated code, bug fix in toString() of DataItem
  • 2013-07-11 - Updated code, bug fix in get(key) in SealedDataItem
  • 2015-03-10 - Added support for an array of values with requisite refactoring.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)