Give JSON and XML Serialization to Existing C++ Code

26 May 2019 · CPOL
This code dumps and restores anything from a single variable to a complete tree of objects, in JSON or XML, in a non-intrusive way, using tentative templates.

Main Goal

  1. Keep variable and object values persistent between executions with a few lines of code, letting the templates do all the work.
  2. Store and restore whole trees of objects with almost no impact on the existing code.
  3. Use well-known formats (currently JSON and XML) while remaining easy to extend.
  4. Tolerate incomplete (or partially lost) config files.
  5. Explore the strengths (especially tentative templates) and weaknesses of the C++ standard, and consider why we ended up with Java rather than an interpreted C++.

A Tiny Example

This tiny piece of code (01-counter.cc) shows how, using templates and overloading in C++, we can meet these goals with one include and one extra line of code.

C++
#include <stdio.h>          // printf
#include "keepconf.h"       /// 1) Include the required templates

int counter;

//KEEPJSN( counter );       /// 2) This does the job ( JSON version )
KEEPXML( counter );         /// 2) This does the job ( XML version )


int main( int argc, char ** argv )
{ printf( "%s has been executed %d times"
        , *argv, counter );

  counter++;                // Increase executions

  return( counter );
}

On first execution, the JSON version creates the file counter.jsn:

JSON
{ "counter":  0
}

The XML version creates counter.xml:

XML
<?xml version="1.0" ?>
<config style="rich">
<counter int="0"/>

The selected file reflects the number of executions. It looks odd, because the line reads more like a variable declaration than code "per se", but it does the job in a "low entropy" way. As you may notice, the XML version carries type information. This will be important in later examples.

A Bunch of Variables

The second example (02-bunch.cc) is only slightly more complex and keeps several variables.

C++
KEEPXMLLIST( bunch )  // For the JSON version, comment this line out and uncomment the next one

// KEEPJSNLIST( bunch )
{ KEEPGLOB( counter );
  KEEPGLOB( message );
  KEEPGLOB( flag    );
};

The variables then share the same config file:

JSON
{ "bunch":
  { "counter":  1
  , "message": "hello"
  , "flag":  123
  }
}

Or the XML version:

XML
<?xml version="1.0" ?>
<config style="rich">
<bunch class="TYPEbunch">
  <counter int="1"/>
  <message str="hello"/>
  <flag byte="123"/>
</bunch>

You can delete a variable from either the JSON or the XML file. The next time you run the program, the deleted lines are recreated with their default values, so we get partial defaults for free.

Now, an Object

The real utility of this set of templates is to give serialization to objects in a non-intrusive way. For example, for an object called PersintentExample (its definition is not shown here), this declaration gives persistence to its instances (objectExample). We need no modification to the object itself, only the serializer declaration. KEEPITEM must be called for each member variable you want to persist.

C++
// Serializer declaration
//
KEEP_LOADER( PersintentExample )
{ KEEPITEM( aInteger      );
  KEEPITEM( aByte         );
  KEEPITEM( ourString     );
  KEEPITEM( anyType       );
  KEEPITEM( uninitialized );
}

KEEPXML( objectExample );    // This does the job ( xml version )
//KEEPJSN( objectExample );  // This does the job ( json version )

Experience tells you that you may need to initialize the object once it has been loaded from disk. You can do this by declaring the following function outside the object:

C++
/**
 *   A builder for the loaded objects can be added,
 * but is optional
 */
void buildObject( PersintentExample & obj )
{ obj.doesNotMatter= 5;
  fprintf( stderr
         , "#\n"
           "# %s has been built\n"
           "#\n\n"
         , typeId( obj ));
}

Here, the concept of tentative templates arises. You can simply not implement buildObject(PersintentExample & obj): no code will be executed and no error will be generated. The only change the original object might need is to declare this function a friend, so that it can reach private members. Run the program 03-object and take a look at objectExample.xml (or objectExample.jsn) to see the results and play with them.

Created "on the fly"

KEEPXML (and KEEPJSN) performs an "implicit" serialization of an object, but we can also use SAVEXML and LOADXML (and their JSON versions) to do this persistence "by hand". 04-new.cc shows how. This is another important capability of this code: it not only fills in already allocated variables, but can also create them, like an "object factory". The example takes different actions depending on whether the object exists in the disk file.

An Arbitrary List of Objects

This is a small upgrade of the previous example. 05-list.cc can "tree serialize" a collection of objects, creating and filling its components from disk. Two things must be defined: how to reach the next object (needed when serializing out) and how to add a newly created object to the collection. Both are handled by nextObject, a function external to the object. It reports the next object when toAdd is NULL and adds toAdd to the collection when toAdd is not NULL. I've used the same function for both to keep the interface simple.

C++
friend Aemet * nextObject( Aemet * hld
                         , Aemet * toAdd ) ;

list.xml (or list.jsn) shows the resulting serialized tree. Try changing it and see the results in the main program.

An Arbitrary List of Arbitrary Objects

This is by far the most advanced example. It lets you keep a tree of arbitrary objects on disk. To add this capability to existing objects, you tell the library how to reach the first and the next object in the list. The saver uses these two functions to walk the list when storing it.

C++
// Declared inside class type, as friend free functions
friend type * nextObject ( type * hld ) { return( hld->next ); }
friend type * firstObject( type * hld ) { return( hld       ); }

And how to add a recently loaded object to the list (used when loading from disk):

C++
void type::linkObject( type * hld )
{ hld->next= next; next= hld;
}

Although this code relies only on templates and uses no virtual functions of its own, the runtime must be able to identify the type of each loaded object. For an arbitrary list, this means the objects involved must be polymorphic, that is, have at least one virtual function. Declaring linkObject virtual may be enough for this, and it is not necessary if the objects are already polymorphic.

I originally created and tested this code on the template version of our good friend TUI, to stream the window and widget list to disk. Here is a screenshot:

Image 1

The highly polymorphic items on screen are stored this way in the template port of the code, which allowed thousands of lines of the original code to be deleted.

JSON is much simpler than XML. It knows about variable names, but not about type names or array indexes. This rules JSON out for this example, at least in the form of the basic specification.

Also, not all compilers fully expose the list of virtual objects. This left MSC++ unable to manage this example. GCC exposes this list only partially, so every class you want to make visible needs one extra step: registering it in a parallel list of virtual objects. For example:

C++
// Register the class ListRec2 to allow its use and tell how to save its members

REGISTERCLASS( ListRec2 )
{ KEEPROOT( Common );

  KEEPITEM( str2 );
  KEEPITEM( ownInteger );
}

Finally, An Array

07-array.cc keeps a simple array of objects. I've included this to point out another weakness, in my opinion, of JSON. If you delete, let's say, the 3rd element of the array in the XML version, the system will correctly recreate and assign the missing element, but JSON has no index information, so the array will be restored incorrectly.

How to Test

The source package bundles build files for the examples: autotools (./configure && make), Code::Blocks (cblocks) and Visual Studio 2010 (vc2010). Older compilers support neither decltype (standard since C++11) nor the typeof extension. Although it is possible to work without them, the declarations become slightly more complex.

Interface and Implementation. Sorry?

When I started programming, I thought that a superb implementation of an algorithm was the important thing: let's write a function body that runs 1000 times faster, and a poor interface can be fixed later. I was totally wrong. One of the main problems of existing software is the nightmare of versions. You can improve a poor implementation at any time, without side effects, but a change in an interface means your old code stops building.

This code is about interface. It knows perfectly well what to do, and you tell the compiler about it in a simple way. This means your code can be used by a wide audience. Think about the STL, now part of C++. As a tiny example, the lack of ".h" on its include files was a consequence of an interface change. The interface is about you telling things to foreign code or to a compiler. Tentative templates add a lot of smartness here: you can define a template used in the regular cases that can be ignored in special cases, giving an interface with "adaptable complexity".

The Explosion of Java: Why?

Now I would like to reflect on the previous points, because I think they explain the success of Java, and they are at the heart of how this library works. Compilers traditionally translate from human language to numbers. This makes all references to variable names, and any manipulation of them, disappear at compile time. The problem is that manipulating object names at runtime has become essential over time. In fact, we need only look at the blogs where programmers beg for a "const char * nameof ()" or something similar. This means that a bridge between human language and the program is needed. In Java all this is quite natural, but in C++ there has traditionally been resistance to building this bridge. There is an obscure operator called typeid that acts as a bridge for variable types, but it was not in the original specification. This was precious time wasted. Here, the turtle (Java) overtakes the hare (C++).

Another important feature of Java is the lack of makefiles or build systems. I think everything needed to build a piece of code (dependencies, libraries, compiler switches, etc.) should live in the source files. Build systems are so abused that they have become a source of entropy.

Using the Code

Although I'm using this, and it is really small, there are possible improvements. The input and output streamers are hardcoded to the disk, and this is simply not necessary. The JSON parser is not tested enough, etc., so I'll give the repository address:

svn checkout https://svn.code.sf.net/p/unicodecs/code/keepconf

Legibility has traditionally been tricky both for templates and for the event-oriented programming used in xml.c and json.c, the ANSI C parsers. The good thing is that these parsers are really simple and can be used standalone. Also, implementing support for another format (let's say YAML) is simple.

The Future

One way to get C++ widely used on the web, and to reuse millions of lines of high-quality code there, may be to complete the C++ interface with what has been learned from Java. Once "#define" has disappeared from non-trivial code (and makefiles too), we will know the job is done (in my opinion).

History

  • 26th May, 2019: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)