Download Serialization_v0.1.zip
Introduction
I needed to add a serialization mechanism (saving and loading of data) to my application, but I could not find a library that fulfilled my requirements. I looked mostly at the boost.serialization library, which also strongly influenced the design of the library presented here. My application uses custom allocators for all of its objects, and I was not able to integrate them with the boost.serialization code. So I decided to write my own library that meets all my requirements.
Background
I have used MFC serialization (CObject::Serialize) for a long time, but I didn't want to integrate MFC into my code, so it was not an option. Most of the goals listed at http://www.boost.org/doc/libs/1_61_0/libs/serialization/doc/ applied to my application, but not all of them (portability, for example). In my library I focused on the following goals:
- Serialization methods/functions without templates - I need to export serialization code from DLLs and using templates just makes it harder
- Custom memory management (arbitrary, not only custom allocators or class new/delete operators)
- Independent versioning for each class definition. That is, when a class definition changes, older files can still be imported into the new version of the class.
- Deep pointer save and restore. That is, save and restore of pointers saves and restores the data pointed to.
- Proper restoration of pointers to shared data.
- Serialization of STL containers and other commonly used templates.
- Non-intrusive. Permit serialization to be applied to unaltered classes. That is, don't require that classes to be serialized be derived from a specific base class or implement specified member functions. This is necessary to easily permit serialization to be applied to classes from class libraries that we cannot or don't want to have to alter.
- The archive interface must be simple enough to easily permit creation of a new type of archive.
- Support for complex migration (described below in the article)
Points 3-8 are copied from the boost link above.
I won't focus on boost.serialization in this article. I think it is a great library, useful in many cases, and it has great documentation. I should also say that I am not an expert in boost.serialization. I simply believed that the problems I had in my code were impossible to overcome and that writing my own library was the only way to go. In the end I think the library I wrote can be useful for others, and I didn't want to keep it to myself.
Building the library
The library uses boost and Visual Leak Detector:
http://www.boost.org/
https://vld.codeplex.com/
Download the libraries from the links above, extract them to your hard drive, edit the StartVS.bat file to set the library paths, and start Visual Studio by executing this batch file. The SerializationTests project requires a built boost test framework library, but that is not needed to build the library itself.
Using the code
Let's start with an example (it is the example from the attached zip file, adapted to fit in a single file):
#include <Serialization/Archive/BinaryOutArchive.h>
#include <Serialization/Archive/BinaryInArchive.h>
#include <Serialization/Archive/OutArchiveStdFunctors.h>
#include <Serialization/Archive/InArchiveStdFunctors.h>
#include <Serialization/File/MemoryBinaryOutFile.h>
#include <Serialization/File/MemoryBinaryInFile.h>
#include <Serialization/DeclareMacros.h>
#include <Serialization/ImplementMacros.h>
#include <cstdio>
#include <cstdlib>
#include <string>
using namespace Serialization;
class MyData
{
public:
MyData()
: m_Value(0)
{ }
MyData(int value, const std::string& name)
: m_Value(value)
, m_Name(name)
{ }
~MyData()
{ }
int GetValue() const { return m_Value; }
void SetValue(int value) { m_Value = value; }
const std::string& GetName() const { return m_Name; }
void SetName(const std::string& name) { m_Name = name; }
private:
int m_Value;
std::string m_Name;
};
void Save(BinaryOutArchive& ar, const MyData& data, const int )
{
ar << data.GetValue();
ar << data.GetName();
}
void Load(BinaryInArchive& ar, MyData& data, const int )
{
int value;
std::string name;
ar >> value;
ar >> name;
data.SetValue(value);
data.SetName(name);
}
DECLARE_TYPE_INFO_STRING_KEY(MyData);
IMPLEMENT_TYPE_INFO(MyData, "MyData", 0);
REGISTER_KEY_SERIALIZATION(BinaryInArchive, BinaryOutArchive, const char*);
REGISTER_CLASS_SERIALIZATION(BinaryInArchive, BinaryOutArchive, MyData);
int main()
{
    void* pBuffer = nullptr;
    size_t bufferSize = 0;
    try
    {
        MyData data1(1, "MyData#1"), data2(2, "MyData#2"), data3(3, "MyData#3");
        MemoryBinaryOutFile fo(1024);
        BinaryOutArchive ao(fo);
        ao << data1 << data2 << data3;
        pBuffer = fo.Release(bufferSize);
        // %zu is the format specifier for size_t
        printf_s("Buffer created: %zu\n", bufferSize);
        printf_s("\tData1: value = %d; name = %s\n", data1.GetValue(), data1.GetName().c_str());
        printf_s("\tData2: value = %d; name = %s\n", data2.GetValue(), data2.GetName().c_str());
        printf_s("\tData3: value = %d; name = %s\n", data3.GetValue(), data3.GetName().c_str());
    }
    catch(SerializationException& e)
    {
        printf_s("Save error: %s\n", e.what());
        return 1;
    }
    printf_s("-------------------------------------------\n");
    try
    {
        MyData data1, data2, data3;
        MemoryBinaryInFile fi(pBuffer, bufferSize);
        BinaryInArchive ai(fi);
        ai >> data1 >> data2 >> data3;
        printf_s("Loaded data:\n");
        printf_s("\tData1: value = %d; name = %s\n", data1.GetValue(), data1.GetName().c_str());
        printf_s("\tData2: value = %d; name = %s\n", data2.GetValue(), data2.GetName().c_str());
        printf_s("\tData3: value = %d; name = %s\n", data3.GetValue(), data3.GetName().c_str());
    }
    catch(SerializationException& e)
    {
        printf_s("Load error: %s\n", e.what());
    }
    free(pBuffer);
    bufferSize = 0;
    return 0;
}
Serialization of user classes is enabled through template specializations. To make the task easier, the library provides a set of macros that generate the required specializations. In the example above those macros are used here:
DECLARE_TYPE_INFO_STRING_KEY(MyData);
IMPLEMENT_TYPE_INFO(MyData, "MyData", 0);
REGISTER_KEY_SERIALIZATION(BinaryInArchive, BinaryOutArchive, const char*);
REGISTER_CLASS_SERIALIZATION(BinaryInArchive, BinaryOutArchive, MyData);
Macro DECLARE_TYPE_INFO_STRING_KEY defines the type of the key that is bound to the class. The key is required for serializing pointers to classes: the archive needs to store type information for a serialized pointer so that later it can create exactly the type that was stored and load the data into it. DECLARE_TYPE_INFO_STRING_KEY binds a string key type to the class. The library supports arbitrary key types through the general macro DECLARE_TYPE_INFO(class_name, key_type_name).
Macro IMPLEMENT_TYPE_INFO binds a key value to the class. This is exactly the value the archive stores in the file when a type must be saved. The keys must be unique across all classes that support serialization.
Macro REGISTER_KEY_SERIALIZATION binds a key type to archives. It is possible to use different keys with different archives. When a class is serialized, the key type bound to the class must be the same as the key type bound to the archive; otherwise an exception is thrown.
Macro REGISTER_CLASS_SERIALIZATION registers a class with the archives, so that references and pointers to that class can be stored in the specified archives.
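The role of the key can be illustrated with a small stand-alone sketch. It does not use the library: Registry, Shape, Circle, Square and MakeRegistry are hypothetical names, and the sketch only assumes the general idea described above, namely that a unique key stored in a file is enough to create the exact type that was saved.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical polymorphic hierarchy, used only for this illustration.
struct Shape
{
    virtual ~Shape() = default;
    virtual std::string Name() const = 0;
};
struct Circle : Shape { std::string Name() const override { return "Circle"; } };
struct Square : Shape { std::string Name() const override { return "Square"; } };

// Maps a unique string key to a factory that creates the matching type.
class Registry
{
public:
    using Factory = std::function<std::unique_ptr<Shape>()>;

    // Returns false if the key is already taken (keys must be unique).
    bool Register(const std::string& key, Factory factory)
    {
        assert(factory && "factory must be callable");
        return m_Factories.emplace(key, std::move(factory)).second;
    }

    // Creates the type registered under the key, or returns nullptr for an unknown key.
    std::unique_ptr<Shape> Create(const std::string& key) const
    {
        const auto it = m_Factories.find(key);
        if (it == m_Factories.end())
            return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Factory> m_Factories;
};

// Builds a registry with both hypothetical types registered.
inline Registry MakeRegistry()
{
    Registry registry;
    registry.Register("Circle", [] { return std::unique_ptr<Shape>(new Circle); });
    registry.Register("Square", [] { return std::unique_ptr<Shape>(new Square); });
    return registry;
}
```

When a pointer is loaded, a reader would first read the key and then call something like Create(key) before loading the object content. How the library itself stores its factories is not shown here.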
More advanced example
The AdvancedExample shows a more realistic use of serialization. The file AppVersion.h contains macros that define the application version. Each version changes the data in City.h, and hence migration is needed. I added different kinds of migration there; all of them could be solved directly in the Load methods.
Important files
City.h | declaration of data types for serialization |
City.cpp | implementation of data types |
metaCity.cpp | serialization of data types |
Archive.h | declaration of archives used by types in City.h |
Archive.cpp | implementation of archives |
AppVersion.h | defines the current application version (change the #define) |
Application description
The application creates 4 items (cities) with predefined data values. Depending on the application version in AppVersion.h, it creates only the data supported by that version. The data can be stored to a file, and the target application version can be selected. So it is possible to export data, switch the application to a higher version, and then import the file; or to export a file from a higher version for a lower version and then load the file in that lower version.
Code documentation
Macro description
DeclareMacros.h contains macros that should be used in header files, and ImplementMacros.h contains macros that should be used in implementation files.
DeclareMacros.h
DECLARE_TYPE_INFO(class_name, key_type_name)
Binds a type to a key type. It is necessary to provide specializations of the DirectValueReader and DirectValueWriter templates for key_type_name.
DECLARE_TYPE_INFO_STRING_KEY(class_name)
Binds a type to the std::string key type.
DECLARE_TYPE_INFO_WSTRING_KEY(class_name)
Binds a type to the std::wstring key type.
ImplementMacros.h
IMPLEMENT_TYPE_INFO(class_name, key, version_number, ...)
Implementation of the DECLARE_TYPE_INFO macro.
class_name | name of the class |
key | key value (must be unique among all IMPLEMENT_TYPE_INFO usages) |
version_number | Enables loading of older archives. Every time the set of class members changes, version_number should be increased. |
... | List of parent classes (only those which support serialization) |
REGISTER_CLASS_SERIALIZATION(in_archive_name, out_archive_name, class_name)
Creates specializations of TypedInArchiveObjectBinder and TypedOutArchiveObjectBinder. These specializations expect the class to have the following 2 methods:
void Save(out_archive_name& ar, const int classVersion) const;
void Load(in_archive_name& ar, const int classVersion);
or as stand-alone functions:
void Save(out_archive_name& ar, const class_name& obj, const int classVersion);
void Load(in_archive_name& ar, class_name& obj, const int classVersion);
REGISTER_KEY_SERIALIZATION(in_archive_name, out_archive_name, key_type_name)
Binds key_type_name to the provided archives. Only types registered with the same key type can be serialized with the archives.
TypedSharedPtrHolder.h
DECLARE_SHARED_PTR0, DECLARE_SHARED_PTR1, DECLARE_SHARED_PTR2
These macros enable serialization of shared pointers for the class. The number indicates how many parent classes the type has.
class C_no_parents { ... };
DECLARE_TYPE_INFO(C_no_parents);
DECLARE_SHARED_PTR0(C_no_parents, std::shared_ptr<C_no_parents>);
class C_single_parent : public A { };
DECLARE_TYPE_INFO(C_single_parent );
DECLARE_SHARED_PTR1(C_single_parent, std::shared_ptr<C_single_parent>, std::shared_ptr<A>);
class C_two_parents : public B, public A { };
DECLARE_TYPE_INFO(C_two_parents);
DECLARE_SHARED_PTR2(C_two_parents, std::shared_ptr<C_two_parents>, std::shared_ptr<B>, std::shared_ptr<A>);
The library supports shared pointers only for classes that have up to 2 parent classes, but it is not a problem to write DECLARE_SHARED_PTR3, DECLARE_SHARED_PTR4, etc. if needed.
Customizing archive type
Check BinaryInArchive and BinaryOutArchive for details. The important step is to inherit from the templates InArchive and OutArchive, which provide the streaming operators. The InArchive template expects the main archive class to have a member method:
void Read(void* pBuffer, size_t size);
The OutArchive template expects the following method:
void Write(const void* pBuffer, size_t size);
You can add the Read method yourself or get it by using BinaryInFileComposition as another parent class. Likewise, you can add the Write method yourself or get it by using BinaryOutFileComposition as another parent class.
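The relationship between the streaming operators and the Read/Write methods can be sketched as follows. This is a minimal stand-alone illustration, not the library's code: OutArchiveSketch, InArchiveSketch, VectorOutArchive and VectorInArchive are hypothetical names, only int is supported, and the use of the curiously recurring template pattern to reach the derived class is an assumption about how such base templates can be written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for an OutArchive-like base: provides operator<< and forwards
// the raw bytes to Derived::Write.
template<typename Derived>
struct OutArchiveSketch
{
    Derived& operator<<(int value)
    {
        Derived& self = static_cast<Derived&>(*this);
        self.Write(&value, sizeof(value));
        return self;
    }
};

// Stand-in for an InArchive-like base: provides operator>> and pulls
// the raw bytes from Derived::Read.
template<typename Derived>
struct InArchiveSketch
{
    Derived& operator>>(int& value)
    {
        Derived& self = static_cast<Derived&>(*this);
        self.Read(&value, sizeof(value));
        return self;
    }
};

// Custom output archive that appends all written bytes to a vector.
class VectorOutArchive : public OutArchiveSketch<VectorOutArchive>
{
public:
    void Write(const void* pBuffer, std::size_t size)
    {
        const unsigned char* p = static_cast<const unsigned char*>(pBuffer);
        m_Bytes.insert(m_Bytes.end(), p, p + size);
    }
    const std::vector<unsigned char>& Bytes() const { return m_Bytes; }

private:
    std::vector<unsigned char> m_Bytes;
};

// Custom input archive that reads sequentially from a copy of the bytes.
class VectorInArchive : public InArchiveSketch<VectorInArchive>
{
public:
    explicit VectorInArchive(const std::vector<unsigned char>& bytes)
        : m_Bytes(bytes), m_Pos(0)
    { }

    void Read(void* pBuffer, std::size_t size)
    {
        assert(m_Pos + size <= m_Bytes.size());  // a real archive would report an error instead
        std::memcpy(pBuffer, m_Bytes.data() + m_Pos, size);
        m_Pos += size;
    }

private:
    std::vector<unsigned char> m_Bytes;
    std::size_t m_Pos;
};
```

The archive class only decides where the bytes go; the streaming operators come from the base template, which is what makes a new archive type cheap to add.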
Helper templates
The whole library customizes serialization of user types through template specializations. Those specializations are used by the archive-object binders described in the class description section below.
Non-default constructible classes
If a class doesn't provide a default constructor, specializations of ReadConstructDataImpl and WriteConstructDataImpl must be provided.
template<typename ArchiveT, typename ObjectT, typename Enabled = void>
struct ReadConstructDataImpl
{
static void Invoke(ArchiveT& , ObjectT* pMemory, const int )
{
...
::new(pMemory) ObjectT(...);
}
};
template<typename ArchiveT, typename ObjectT, typename Enable = void>
struct WriteConstructDataImpl
{
static void Invoke(ArchiveT& , const ObjectT& , const int )
{
}
};
Note that the last template parameter, Enable, can be used to cover a group of classes by using std::enable_if. Suppose you have an intermediate class A that accepts 2 strings, and you have inherited 3 classes from A that have the same constructor signature as A (to be able to call the parent constructor). In this case you can write:
template<typename ArchiveT, typename ObjectT>
struct WriteConstructDataImpl<ArchiveT, ObjectT, typename std::enable_if<std::is_base_of<A, ObjectT>::value>::type>
{
static void Invoke(ArchiveT& ar, const ObjectT& obj, const int )
{
ar << obj.GetString1();
ar << obj.GetString2();
}
};
template<typename ArchiveT, typename ObjectT>
struct ReadConstructDataImpl<ArchiveT, ObjectT, typename std::enable_if<std::is_base_of<A, ObjectT>::value>::type>
{
static void Invoke(ArchiveT& ar, ObjectT* pMemory, const int )
{
std::string s1, s2;
ar >> s1 >> s2;
::new(pMemory) ObjectT(s1, s2);
}
};
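The construction step used inside ReadConstructDataImpl (placement new into memory that was allocated separately) can be tried in isolation. The following is a minimal sketch with no dependency on the library; Label and ConstructInRawMemory are hypothetical names, and the two strings stand for construct data that would normally be read from an archive.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical class without a default constructor.
class Label
{
public:
    Label(const std::string& s1, const std::string& s2)
        : m_Text(s1 + s2)
    { }
    const std::string& Text() const { return m_Text; }

private:
    std::string m_Text;
};

// Walks through the same phases a loader has to perform for such a class:
// allocate raw memory, construct in place, use, destruct, deallocate.
inline std::string ConstructInRawMemory(const std::string& s1, const std::string& s2)
{
    // 1. Allocate raw memory; no constructor runs here.
    void* pMemory = std::malloc(sizeof(Label));
    if (pMemory == nullptr)
        throw std::bad_alloc();

    // 2. Construct the object in place from the construct data.
    Label* pLabel = ::new(pMemory) Label(s1, s2);
    assert(static_cast<void*>(pLabel) == pMemory);  // placement new returns the given address

    const std::string result = pLabel->Text();

    // 3. Call the destructor explicitly, then release the raw memory.
    pLabel->~Label();
    std::free(pMemory);
    return result;
}
```

Keeping allocation and construction separate is what allows both steps to be customized independently.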
Serialization of types without DECLARE_TYPE_INFO
It is possible to serialize types without DECLARE_TYPE_INFO. In this case, however, the serialization library needs specializations of the DirectValueWriter and DirectValueReader templates.
struct MyHelperDataType
{
int a, b, c;
};
template<>
struct DirectValueWriter<MyOutArchive, MyHelperDataType>
{
static void Invoke(MyOutArchive& ar, const MyHelperDataType& value)
{
ar << value.a << value.b << value.c;
}
};
template<>
struct DirectValueReader<MyInArchive, MyHelperDataType>
{
static void Invoke(MyInArchive& ar, MyHelperDataType& value)
{
ar >> value.a >> value.b >> value.c;
}
};
The disadvantage is that pointers to such a type cannot be serialized to or from archives. Also, migrating the data is not easy when members change.
Custom memory allocation support
template<typename ArchiveT, typename ObjectT, typename Enabled = void>
struct AllocateDataImpl
{
static void* Invoke(ArchiveT& , const int )
{
return malloc(sizeof(ObjectT));
}
};
template<typename ArchiveT, typename ObjectT, typename Enabled = void>
struct DeallocateDataImpl
{
static void Invoke(ArchiveT& , const int , void* pMemory)
{
free(pMemory);
}
};
You can write your own specialization of those templates to provide your own allocation.
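Such a specialization might look like the sketch below. To keep it self-contained, the primary templates are repeated as stand-ins inside a hypothetical namespace Sketch; PoolAllocated, Particle, Plain, DummyArchive and LiveCustomAllocations are also hypothetical, and the "custom allocator" is just ::operator new with a counter so that the effect is observable.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <type_traits>

namespace Sketch
{
    struct DummyArchive {};

    // Stand-in primary templates with the same shape as the ones shown above.
    template<typename ArchiveT, typename ObjectT, typename Enabled = void>
    struct AllocateDataImpl
    {
        static void* Invoke(ArchiveT& , const int ) { return std::malloc(sizeof(ObjectT)); }
    };

    template<typename ArchiveT, typename ObjectT, typename Enabled = void>
    struct DeallocateDataImpl
    {
        static void Invoke(ArchiveT& , const int , void* pMemory) { std::free(pMemory); }
    };

    // Hypothetical marker base: types derived from it use the custom allocation path.
    struct PoolAllocated {};
    struct Particle : PoolAllocated { float x; float y; };
    struct Plain { int value; };

    // Counter that makes the custom path observable.
    inline int& LiveCustomAllocations()
    {
        static int count = 0;
        return count;
    }

    // Partial specializations selected for every type derived from PoolAllocated.
    template<typename ArchiveT, typename ObjectT>
    struct AllocateDataImpl<ArchiveT, ObjectT,
        typename std::enable_if<std::is_base_of<PoolAllocated, ObjectT>::value>::type>
    {
        static void* Invoke(ArchiveT& , const int )
        {
            ++LiveCustomAllocations();
            return ::operator new(sizeof(ObjectT));
        }
    };

    template<typename ArchiveT, typename ObjectT>
    struct DeallocateDataImpl<ArchiveT, ObjectT,
        typename std::enable_if<std::is_base_of<PoolAllocated, ObjectT>::value>::type>
    {
        static void Invoke(ArchiveT& , const int , void* pMemory)
        {
            --LiveCustomAllocations();
            ::operator delete(pMemory);
        }
    };
}
```

The allocation and deallocation specializations must always be provided as a pair, so that memory is released by the same mechanism that allocated it.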
If the library needs to construct a type directly, it uses the template class ConstructDefaultValue. This template is also useful for custom memory allocation support. For example, when a container of containers must be read from an archive, there is otherwise no way to customize construction of the inner containers. The support for std containers uses the ConstructDefaultValue template to construct the stored types, so an implementation can pass custom allocators to the constructors of those containers.
std::unique_ptr templates can also use customized deleters. The library provides WriteUniquePtrDeleter and ReadUniquePtrDeleter to allow writing and reading custom data bound to deleters.
Support for shared pointers is discussed in a separate section.
Serializing parent class content
Because serialization of a class can be implemented as member methods or as stand-alone functions, it is not obvious how to serialize parent class data. To unify the calls regardless of how the parent class implements serialization, the library contains the template BaseObject<T>. The template parameter is the parent type. Note that when storing data, the parent type should be const T.
void B::Save(MyOutArchive& ar, const int ) const
{
ar << BaseObject<const A>(*this);
}
void B::Load(MyInArchive& ar, const int )
{
ar >> BaseObject<A>(*this);
}
Serialization::Access and friend access
If serialization is implemented using member methods, those methods can be declared private if Serialization::Access is given friend access to the class.
class MyClass
{
friend Serialization::Access;
private:
void Save(MyOutArchive& ar, const int classVersion) const;
void Load(MyInArchive& ar, const int classVersion);
};
Shared pointers support
It is necessary to declare a shared pointer type for the class with the macro DECLARE_SHARED_PTR0 (or DECLARE_SHARED_PTR1 or DECLARE_SHARED_PTR2, according to how many parent classes the type has). It is then also necessary to provide a specialization of this template:
template<typename ArchiveT, typename SharedPtrT, typename Enabled = void>
struct CreateSharedPtrImpl
{
static SharedPtrT Invoke(ArchiveT& , typename SharedPtrT::value_type* )
{
...
}
};
Together with the above template specialization, the library also expects the following specializations:
template<typename T>
struct SharedPtrValueGetter
{
static void* Invoke(const T& sharedPtr)
{
...
}
};
template<typename WeakPtrT>
struct ToSharedPtr
{
using SharedPtrT = ...;
static SharedPtrT Invoke(const WeakPtrT& ptr)
{
return ...;
}
};
template<typename T, typename U>
struct UpCastSharedPtr
{
U Invoke(const T& )
{
return ...;
}
};
The library has built-in support for std::shared_ptr and std::weak_ptr (see StdSharedPtrImpl.h).
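Why shared pointers need special treatment can be shown with a small stand-alone sketch of the usual pointer-tracking technique: on save every distinct object gets an id, and on load exactly one object is created per id. This is an assumption about the general approach, not a description of the library's internals; Record and RoundTripShared are hypothetical names, and a vector of records stands in for the file.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// What would be written to the file for each pointer: an object id and the content.
struct Record
{
    int id;
    int value;
};

// Simulates saving and loading a list of shared pointers while preserving sharing.
inline std::vector<std::shared_ptr<int>> RoundTripShared(
    const std::vector<std::shared_ptr<int>>& input)
{
    // "Save": store ids instead of addresses; the same address always gets the same id.
    std::map<const void*, int> ids;
    std::vector<Record> file;
    for (const auto& p : input)
    {
        assert(p && "this sketch does not handle null pointers");
        auto it = ids.find(p.get());
        if (it == ids.end())
            it = ids.emplace(p.get(), static_cast<int>(ids.size())).first;
        file.push_back(Record{ it->second, *p });
    }

    // "Load": create one object per id and hand out the same shared_ptr again.
    std::map<int, std::shared_ptr<int>> objects;
    std::vector<std::shared_ptr<int>> output;
    for (const Record& r : file)
    {
        auto it = objects.find(r.id);
        if (it == objects.end())
            it = objects.emplace(r.id, std::make_shared<int>(r.value)).first;
        output.push_back(it->second);
    }
    return output;
}
```

Two pointers that shared one object before saving share one (new) object after loading, while equal values held by different objects stay separate. A real file format would write the content only once per id.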
Exception handling
Errors are reported using exceptions. All exceptions thrown by the library inherit from Serialization::SerializationException and are declared in the Exceptions.h file.
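Because of the common base class, a single catch clause is enough to handle every serialization error. The sketch below shows the pattern with stand-in types in a hypothetical namespace Sketch; UnknownKeyException and TryLoad are invented for the illustration, and deriving the base from std::runtime_error is an assumption (the first example only shows that the library's exception has a what() method).

```cpp
#include <stdexcept>
#include <string>

namespace Sketch
{
    // Stand-in for a common base class of all serialization errors.
    struct SerializationException : std::runtime_error
    {
        explicit SerializationException(const std::string& message)
            : std::runtime_error(message)
        { }
    };

    // Hypothetical concrete error derived from the common base.
    struct UnknownKeyException : SerializationException
    {
        explicit UnknownKeyException(const std::string& key)
            : SerializationException("unknown key: " + key)
        { }
    };

    // Pretends to load a type by key; returns "ok" or the error message.
    inline std::string TryLoad(const std::string& key)
    {
        try
        {
            if (key != "MyData")
                throw UnknownKeyException(key);
            return "ok";
        }
        catch (const SerializationException& e)  // catches every derived error type
        {
            return e.what();
        }
    }
}
```

Catching by const reference to the base keeps the handler independent of which concrete exception was thrown.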
STD containers
The library has built-in support for serializing STD containers. To use it, include InArchiveStdFunctors.h and OutArchiveStdFunctors.h. It is also possible to write serialization for boost containers, but boost contains so many of them that I chose not to.
Serialization of template classes
Template classes are tricky. Different template arguments produce different types, and the serialization library needs a unique key for every type it should support.
template<typename T>
class MyTemplate
{
...
};
DECLARE_TYPE_INFO(MyTemplate, std::string);
In the case of templates it is possible to expand and adapt the code generated by the DECLARE_TYPE_INFO macro:
namespace Serialization
{
namespace Detail
{
template<typename T>
struct TypeInfoTraits<MyTemplate<T>>
{
using value_type = MyTemplate<T>;
using key_type = std::string;
};
}
}
but it is not possible to trick IMPLEMENT_TYPE_INFO in this way, because a key value (like the string in the example above) must be provided for every type. Hence there must be one IMPLEMENT_TYPE_INFO per template instantiation.
IMPLEMENT_TYPE_INFO(MyTemplate<int>, "MyTemplateInt", 0);
IMPLEMENT_TYPE_INFO(MyTemplate<char>, "MyTemplateChar", 0);
IMPLEMENT_TYPE_INFO(MyTemplate<bool>, "MyTemplateBool", 0);
Because it is necessary to write IMPLEMENT_TYPE_INFO for every template instantiation, I also write DECLARE_TYPE_INFO for those types separately, and export the instantiated template types from the DLL if serialization is placed in a separate module.
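The reason one key per instantiation is needed can be shown without the library. In the sketch below, TypeKey and KeyOf are hypothetical names: a traits template is declared but not defined, and each supported instantiation of MyTemplate gets its own full specialization carrying its key, which is roughly what one IMPLEMENT_TYPE_INFO per instantiation amounts to.

```cpp
#include <string>

// Simplified stand-in for the template class from the text.
template<typename T>
class MyTemplate
{
};

// Hypothetical key traits. The primary template is only declared, so asking
// for the key of an unregistered type fails at compile time.
template<typename T>
struct TypeKey;

// One full specialization (and one unique key) per template instantiation.
template<>
struct TypeKey<MyTemplate<int>>
{
    static const char* Value() { return "MyTemplateInt"; }
};

template<>
struct TypeKey<MyTemplate<char>>
{
    static const char* Value() { return "MyTemplateChar"; }
};

// Looks up the key for the static type of the argument.
template<typename T>
std::string KeyOf(const T& )
{
    return TypeKey<T>::Value();
}
```

A partial specialization over all T could share the traits, but it could not produce a distinct key value for each T without extra machinery, which is why the keys are listed explicitly.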
Archive-object binders
ArchiveObjectBinder
Binds archive and object types together so that it is possible to call specific methods on both types without using common base classes. This class serves as the base class that defines the interface for input and output archive-object binders.
InArchiveObjectBinder
Interface class for reading data from an archive to an object
void Read(BaseInArchive& ar, void* ptr, const int classVersion) const
Read content of an object.
void ReadConstructData(BaseInArchive& ar, void* ptr, const int classVersion) const
Read input data for constructing an object.
void* AllocateObject(BaseInArchive& ar, const int classVersion) const
Allocate an object (constructor is not called yet).
void DeallocateObject(BaseInArchive& ar, const int classVersion, void* pMemory) const
Release a memory previously allocated by AllocateObject call.
void DestructObject(BaseInArchive& ar, const int classVersion, void* pMemory) const
Call a destructor of bound object.
std::unique_ptr<SharedPtrWrapper> CreateSharedPtr(BaseInArchive& ar, void* ptr) const
Wrap the created and loaded pointer in a shared pointer wrapper. Described in the Shared pointers support section.
void GetInputObjects(BaseInArchive& ar, void* ptr, LoadedPointerInfoArray& inputObjects) const
Support for complex object migration. Described in Complex migration support section.
bool PostLoad(BaseInArchive& ar, void* ptr, const int classVersion) const
Support for complex object migration. Described in Complex migration support section.
InArchiveKeyBinder
Interface for reading keys from an archive. The key describes the class type written in the archive. This archive-object binder is important for supporting writing and reading of polymorphic pointers.
const TypeInfo::Key& Read(BaseInArchive& ar) const
Read a key from an archive.
DeferredInArchiveObjectBinder
Support for reading objects directly, such as objects stored in containers. This archive-object binder makes it possible to use non-default-constructible objects in std::vector-like containers.
OutArchiveObjectBinder
Interface class for writing data to an archive from an object.
void Write(BaseOutArchive& ar, const void* ptr, const int classVersion) const
Write a content from the input object.
void WriteConstructData(BaseOutArchive& ar, const void* ptr, const int classVersion) const
Write a data required by constructor of the input object.
OutArchiveKeyBinder
Interface for writing keys to an archive. The key is bound together with class type to support deep writing of the pointers.
void Write(BaseOutArchive& ar, const TypeInfo::Key& key) const
Write a key to an archive.
DeferredOutArchiveObjectBinder
Support for writing objects directly which can be loaded by using DeferredInArchiveObjectBinder. It writes all data required to call a constructor of written object and the content of the object.
template<typename ArchiveT, typename ObjectT> TypedInArchiveObjectBinder
Main implementation of InArchiveObjectBinder. It binds the actual types of archive and object together. The implementation is customized through further template classes, so it is not necessary to reimplement the class for specific archives and class types; specializing the sub-templates is enough. The input ArchiveT must inherit from BaseInArchive, and the input ObjectT must be a non-pointer, non-const type. Instances of this class are created by the REGISTER_CLASS_SERIALIZATION macro. BaseInArchive ensures that all input parameters of the methods have the correct types.
void Read(BaseInArchive& ar, void* ptr, const int classVersion) const
Calls the ObjectT::Load member method or the stand-alone void Load(ArchiveT&, ObjectT&, int) function.
void ReadConstructData(BaseInArchive& ar, void* ptr, const int classVersion) const
For abstract classes it doesn't do anything and the method shouldn't be called (it throws an exception if called). For non-abstract classes it calls the default constructor. The behavior can be customized by specializing the ReadConstructDataImpl template.
void* AllocateObject(BaseInArchive& ar, const int classVersion) const
Allocates memory for an object. For abstract classes it doesn't do anything and the method shouldn't be called (it throws an exception if called). For non-abstract classes it allocates memory using malloc. The behavior can be customized by specializing the AllocateDataImpl template.
void DeallocateObject(BaseInArchive& ar, const int classVersion, void* pMemory) const
Deallocates the memory of an object previously allocated by an AllocateObject call. For abstract classes it doesn't do anything and the method shouldn't be called (it throws an exception if called). For non-abstract classes it releases memory using free. The behavior can be customized by specializing the DeallocateDataImpl template.
void DestructObject(BaseInArchive& ar, const int classVersion, void* pMemory) const
Calls the destructor of an object previously constructed by the ReadConstructData method. The method shouldn't be called on abstract classes (it throws an exception): ReadConstructData must not be called on abstract classes, so neither may this method. The behavior can be customized by specializing the DestructDataImpl template.
std::unique_ptr<SharedPtrWrapper> CreateSharedPtr(BaseInArchive& ar, void* ptr) const
Create a shared pointer wrapper. Described in Shared pointers support section.
void GetInputObjects(BaseInArchive& ar, void* ptr, LoadedPointerInfoArray& inputObjects) const
If complex migration is enabled for ArchiveT, it calls the ObjectT::GetInputObjects member method or the stand-alone void GetInputObjects(ArchiveT&, ObjectT&, LoadedPointerInfoArray&) function. Described in more detail in the Complex migration support section.
bool PostLoad(BaseInArchive& ar, void* ptr, const int classVersion) const
If complex migration is enabled for ArchiveT, it calls the ObjectT::PostLoad member method or the stand-alone bool PostLoad(ArchiveT&, ObjectT&, const int) function. Described in more detail in the Complex migration support section.
template<typename ArchiveT, typename KeyT> TypedInArchiveKeyBinder
Implementation of InArchiveKeyBinder that reads keys of a specific type. The specialization of this template is created by the REGISTER_KEY_SERIALIZATION macro.
template<typename ArchiveT, typename ObjectT> TypedDeferredInArchiveObjectBinder
Implementation of DeferredInArchiveObjectBinder. Instances of this class are created directly on the stack when an object must be loaded directly from the archive.
ObjectT Read(BaseInArchive& ar);
Reads an object from the archive. ObjectT must be movable, but it does not have to be copyable.
template<typename ArchiveT, typename ObjectT> TypedOutArchiveObjectBinder
Main implementation of OutArchiveObjectBinder. It binds the actual types of archive and object together. The implementation is customized through further template classes, so it is not necessary to reimplement the class for specific archives and class types; specializing the sub-templates is enough. The input ArchiveT must inherit from BaseOutArchive, and the input ObjectT must be a non-pointer, non-const type. Instances of this class are created by the REGISTER_CLASS_SERIALIZATION macro. BaseOutArchive ensures that all input parameters of the methods have the correct types.
void Write(BaseOutArchive& ar, const void* ptr, const int classVersion) const
Calls the ObjectT::Save member method or the stand-alone void Save(ArchiveT&, const ObjectT&, int) function.
void WriteConstructData(BaseOutArchive& ar, const void* ptr, const int classVersion) const
For abstract classes it doesn't do anything and the method shouldn't be called (it throws an exception if called). For non-abstract classes it uses the template class WriteConstructDataImpl to customize behavior. By default the template doesn't do anything, but it is possible to write a specialization that stores the input data needed to construct a class.
template<typename ArchiveT, typename KeyT> TypedOutArchiveKeyBinder
Implementation of OutArchiveKeyBinder that writes a key of a specific type. The specialization of this template is created by the REGISTER_KEY_SERIALIZATION macro.
template<typename ArchiveT, typename ObjectT> TypedDeferredOutArchiveObjectBinder
Implementation of DeferredOutArchiveObjectBinder. Instances of this class are created directly on the stack when an object must be saved directly to the archive.
void Write(BaseOutArchive& ar, const ObjectT& obj)
Write an object to archive. It writes construct data and content of the object.
Creating files compatible with older version of an application
In general the library doesn't support this. If an application needs to export files that can be loaded by older versions of the same application, I recommend not using class versioning at all. With class versioning, the file version is defined by the set of all class versions serialized in the file. It would be necessary to track those sets somehow and to export the matching class versions whenever one of the classes is about to change. Instead, I suggest using a single number to track the application file version and passing it to the archive. Then every Save/Load method can access this number and store or load only what existed at that time. A good practice in this case is to create an intermediate file version (one that is not yet released). When a class is about to change, the content of the Save/Load method is copied and kept for saving/loading older versions, an if statement is added, and the copy can then be adapted. This is very similar to using the class versioning system, except that the same number is used for all classes.
void Save(MyArchive& ar, const int classVersion) const
{
    if(ar.GetFileVersion() <= APP_FILE_VERSION_1_0)
    {
        ar << m_Data1;
    }
    else if(ar.GetFileVersion() <= APP_FILE_VERSION_2_0)
    {
        ar << m_Data1;
        ar << m_Data2;
    }
    else
    {
        ar << m_Data1;
        ar << m_Data2;
        ar << m_Data3;
    }
}
Note that the last branch is also active for versions 4.0, 5.0, etc. If, for example, the class didn't change between versions 3.0 and 5.0 but changes in 6.0, the Save method would look like this:
void Save(MyArchive& ar, const int classVersion) const
{
    if(ar.GetFileVersion() <= APP_FILE_VERSION_1_0)
    {
        ar << m_Data1;
    }
    else if(ar.GetFileVersion() <= APP_FILE_VERSION_2_0)
    {
        ar << m_Data1;
        ar << m_Data2;
    }
    else if(ar.GetFileVersion() <= APP_FILE_VERSION_5_0)
    {
        ar << m_Data1;
        ar << m_Data2;
        ar << m_Data3;
    }
    else
    {
        ar << m_Data1;
        ar << m_Data2;
        ar << m_Data3;
        ar << m_Data4;
    }
}
Loading looks similar, but it is very important not to forget to initialize new members when the class is loaded from older archives.
void Load(MyArchive& ar, const int classVersion)
{
    if(ar.GetFileVersion() <= APP_FILE_VERSION_1_0)
    {
        ar >> m_Data1;
        m_Data2 = ...;
        m_Data3 = ...;
        m_Data4 = ...;
    }
    else if(ar.GetFileVersion() <= APP_FILE_VERSION_2_0)
    {
        ar >> m_Data1;
        ar >> m_Data2;
        m_Data3 = ...;
        m_Data4 = ...;
    }
    else if(ar.GetFileVersion() <= APP_FILE_VERSION_5_0)
    {
        ar >> m_Data1;
        ar >> m_Data2;
        ar >> m_Data3;
        m_Data4 = ...;
    }
    else
    {
        ar >> m_Data1;
        ar >> m_Data2;
        ar >> m_Data3;
        ar >> m_Data4;
    }
}
So when a new member is added, it must be added to all sections. I don't recommend any optimization here, such as trying to avoid duplicate code:
void Load(MyArchive& ar, const int classVersion)
{
    ar >> m_Data1;
    if(ar.GetFileVersion() >= APP_FILE_VERSION_2_0)
        ar >> m_Data2;
    else
        m_Data2 = ...;
    // m_Data3 is present in every file version above 2.0 (see the Save method above)
    if(ar.GetFileVersion() > APP_FILE_VERSION_2_0)
        ar >> m_Data3;
    else
        m_Data3 = ...;
    if(ar.GetFileVersion() >= APP_FILE_VERSION_6_0)
        ar >> m_Data4;
    else
        m_Data4 = ...;
}
First of all, it is a mess; believe me, migration can get ugly over time, and it is very easy to make a mistake in code like this. The most important rule is that once an application has been released to the public and files must be exchangeable with that released application, it is better to keep the released code unchanged. This is why it is better to copy the code, wrap it in a condition, and alter only the copy for the current application version.
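The file-version approach can be tried end to end with a small stand-alone sketch. Nothing here comes from the library: IntArchive is a hypothetical archive that stores ints in memory and serves both for writing and reading, FILE_VERSION_1_0 and FILE_VERSION_2_0 are made-up version constants, and -1 is an arbitrary default for a member that is missing in an old file.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical application file versions.
enum FileVersion { FILE_VERSION_1_0 = 1, FILE_VERSION_2_0 = 2 };

// Stand-in archive: remembers the target file version and a list of ints.
class IntArchive
{
public:
    explicit IntArchive(int fileVersion)
        : m_FileVersion(fileVersion), m_Pos(0)
    { }

    int GetFileVersion() const { return m_FileVersion; }
    std::size_t Size() const { return m_Values.size(); }

    IntArchive& operator<<(int value)
    {
        m_Values.push_back(value);
        return *this;
    }

    IntArchive& operator>>(int& value)
    {
        assert(m_Pos < m_Values.size());  // a real archive would report an error instead
        value = m_Values[m_Pos++];
        return *this;
    }

private:
    int m_FileVersion;
    std::vector<int> m_Values;
    std::size_t m_Pos;
};

struct Settings
{
    int m_Data1 = 0;
    int m_Data2 = 0;  // added in file version 2.0

    void Save(IntArchive& ar) const
    {
        if(ar.GetFileVersion() <= FILE_VERSION_1_0)
        {
            ar << m_Data1;  // released 1.0 layout, kept unchanged
        }
        else
        {
            ar << m_Data1;
            ar << m_Data2;
        }
    }

    void Load(IntArchive& ar)
    {
        if(ar.GetFileVersion() <= FILE_VERSION_1_0)
        {
            ar >> m_Data1;
            m_Data2 = -1;  // member missing in 1.0 files: initialize explicitly
        }
        else
        {
            ar >> m_Data1;
            ar >> m_Data2;
        }
    }
};
```

Exporting for version 1.0 writes a single value, and loading such a file leaves m_Data2 at its explicit default instead of an uninitialized or stale value.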
Complex migration support
Data migration is needed when classes change and it must still be possible to load files that were already created (and possibly already shipped to customers). The simplest type of migration is when the members of a class change directly: a member is added or removed, or its data type changes. For this simple purpose the library has class versioning. If the class version is increased with every change, it is possible to check the class version during loading and adapt the data accordingly. Real cases are more complicated, however, and it is not always possible to migrate the data of a class within its Load method/function. During the life cycle of an application a single class can grow rapidly, and it may become necessary to split it into 2 types, or the opposite: to merge 2 types into a single type. Another type of migration is a computation based on data from several objects. During serialization the order in which objects are stored/loaded cannot be controlled, and hence reading data through a member pointer of a class can lead to problems. The library supports executing more complex migration after all objects are loaded. If such support is required, the input archive (the implementation of InArchive) must be used with the macro ENABLE_ARCHIVE_MIGRATION. When this macro is used, all classes registered with REGISTER_CLASS_SERIALIZATION must then declare an additional interface:
As stand-alone functions:
void GetInputObjects(ArchiveT& ar, T& obj, LoadedPointerInfoArray& inputObjects);
bool PostLoad(ArchiveT& ar, T& obj, const int classVersion);
or member methods:
void GetInputObjects(ArchiveT& ar, LoadedPointerInfoArray& inputObjects);
bool PostLoad(ArchiveT& ar, const int classVersion);
Migration code should be placed inside the PostLoad methods/functions. GetInputObjects defines the order in which the PostLoad notifications are called. Let's consider the following class:
class A
{
...
public:
B* m_pB;
C* m_pC;
D* m_pD;
};
void Save(MyOutArchive& ar, const A& a, const int classVersion)
{
ar << a.m_pB << a.m_pC;
if(classVersion >= 1)
ar << a.m_pD;
}
void Load(MyInArchive& ar, A& a, const int classVersion)
{
ar >> a.m_pB >> a.m_pC;
if(classVersion >= 1)
ar >> a.m_pD;
else
{
...
}
}
Setting aside the questionable design (a.m_pB and a.m_pC should be guaranteed to be nullptr before loading, or the Load() function should release the pointers if class A owns them), suppose that class A needs to extract some data from both B and C and use it to initialize m_pD for classVersion == 0. Inside Load() there is no guarantee that B and C are already fully loaded. If B or C points back to A and serialization stores the pointer to B first, then while A is being loaded, B is only partially initialized: the constructor of B has already been called, but B::Load() has not finished yet, so not all of its members have been read from the archive. For such cases PostLoad is required. It is necessary to tell the input archive that A::PostLoad should be called after B and C are initialized. GetInputObjects() is exactly the function for this purpose:
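void GetInputObjects(MyInArchive& ar, A& a, Serialization::LoadedPointerInfoArray& inputObjects)
{
// This is a stand-alone function, so the members are reached through 'a'.
if(a.m_pB != nullptr)
Serialization::AddInputObject(ar, inputObjects, *a.m_pB);
if(a.m_pC != nullptr)
Serialization::AddInputObject(ar, inputObjects, *a.m_pC);
}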
LoadedPointerInfoArray cannot be filled directly; instead the library provides the helper template function AddInputObject, which creates an item and stores it in the array. This ensures that PostLoad is called first on B and C (in an undefined order, unless B or C specifies another class as its input through its own GetInputObjects) and only then on A. A's PostLoad implementation can then correctly initialize the m_pD member pointer. The notifications are sent when MigrationManager::Execute is called. If classes are derived from other classes, both PostLoad and GetInputObjects must also call the parent methods. As with Load/Save, the parent methods/functions should be called through the BaseObject template. The library's InArchiveMigration has member methods which accept a BaseObject reference and call the corresponding notification.
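To make the ordering rule more tangible, here is a small sketch of how an order "inputs first, then the object itself" can be computed. It is not the library's code (the library sorts its dependency graph with the boost graph library); it is a plain depth-first search, and InputMap, Visit and ComputePostLoadOrder are hypothetical names used only here. Objects are represented by strings instead of loaded pointers.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each object name maps to the names it listed as input objects.
using InputMap = std::map<std::string, std::vector<std::string>>;

namespace detail
{
    inline void Visit(const std::string& name, const InputMap& inputs,
                      std::set<std::string>& visited, std::vector<std::string>& order)
    {
        // Skip objects that are already scheduled; this also ends the
        // recursion if two objects list each other as inputs.
        if(!visited.insert(name).second)
            return;

        const auto it = inputs.find(name);
        if(it != inputs.end())
        {
            for(const std::string& input : it->second)
                Visit(input, inputs, visited, order);
        }
        order.push_back(name); // all inputs were scheduled before the object itself
    }
}

// Returns an order in which every object comes after its input objects
// (assuming the inputs do not form a cycle).
inline std::vector<std::string> ComputePostLoadOrder(const InputMap& inputs)
{
    std::set<std::string> visited;
    std::vector<std::string> order;
    for(const auto& entry : inputs)
        detail::Visit(entry.first, inputs, visited, order);
    return order;
}
```

With A listing B and C as its inputs, the computed order ends with A, which matches the behaviour described above: PostLoad reaches A only after B and C.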
If an even deeper hierarchical migration is necessary, involving objects which are not directly related (e.g. migrating a whole array of objects), MigrationManager provides the concept of packets and migrators. To use this support, all classes need access to the MigrationManager. The library provides the template InArchiveMigration, whose member method GetMigrationManager() gives this access. Just change the base class of your custom archive from InArchive to InArchiveMigration.
Packets are small classes which just hold data. During serialization the Load and PostLoad functions can collect objects and other data and store them in packet(s). Later, when the migrators are executed, those packets can be accessed and their data processed. Packets are managed by MigrationManager's methods RegisterPacket, UnregisterPacket and GetPacket. The library contains the helper template class Serialization::PacketImpl, which binds a packet to a specified key type. Unlike serializable classes, packets with different key types can be registered with the same MigrationManager (as they are never stored in the archive, the key type doesn't play a big role). It is convenient to implement packets with a static getter that extracts the packet from the archive:
class MyDataPacket : public Serialization::PacketImpl<std::string>
{
public:
MyDataPacket()
: Serialization::PacketImpl<std::string>("MyDataPacket_key")
{
}
static MyDataPacket& Get(MyArchive& ar)
{
// Temporary instance used only to obtain the key.
MyDataPacket ref;
auto* pPacketRawPtr = static_cast<MyDataPacket*>(ar.GetMigrationManager().GetPacket(ref.GetKey()));
if(pPacketRawPtr == nullptr)
{
auto pPacketPtr = std::make_unique<MyDataPacket>();
pPacketRawPtr = pPacketPtr.get();
if(!ar.GetMigrationManager().RegisterPacket(std::move(pPacketPtr)))
{
throw std::runtime_error("Packet was not registered!!");
}
}
return *pPacketRawPtr;
}
private:
...
};
The disadvantage is that each packet needs its own type, so the same type of packet cannot be registered twice. On the other hand, reusing a packet type would require maintaining some kind of list of keys.
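The register/get/unregister behaviour of packets can be imitated with a few lines of standard C++. The sketch below is a toy model under my own assumptions: ToyPacket, ToyPacketRegistry and CollectedNames are hypothetical names, the key is always a std::string, and the real Packet, PacketImpl and MigrationManager classes differ in detail.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: a packet only knows its key.
class ToyPacket
{
public:
    explicit ToyPacket(std::string key) : m_Key(std::move(key)) {}
    virtual ~ToyPacket() = default;
    const std::string& GetKey() const { return m_Key; }

private:
    std::string m_Key;
};

// Hypothetical registry which owns the packets and finds them by key.
class ToyPacketRegistry
{
public:
    // Returns false if the pointer is null or the key is already taken.
    bool RegisterPacket(std::unique_ptr<ToyPacket> pPacket)
    {
        if(!pPacket)
            return false;
        const std::string key = pPacket->GetKey(); // copy the key before moving the packet
        return m_Packets.emplace(key, std::move(pPacket)).second;
    }

    // Does nothing if no packet uses the key.
    void UnregisterPacket(const std::string& key) { m_Packets.erase(key); }

    // Returns nullptr if no packet uses the key.
    ToyPacket* GetPacket(const std::string& key) const
    {
        const auto it = m_Packets.find(key);
        return it != m_Packets.end() ? it->second.get() : nullptr;
    }

private:
    std::map<std::string, std::unique_ptr<ToyPacket>> m_Packets;
};

// One type per packet: the key is fixed by the type.
class CollectedNames : public ToyPacket
{
public:
    CollectedNames() : ToyPacket("CollectedNames_key") {}
    std::vector<std::string> m_Names; // data collected during Load/PostLoad
};
```

Because the key is fixed inside the CollectedNames constructor, a second registration of the same type is rejected, which is exactly the "one type per packet" limitation mentioned above.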
Migrators are classes that perform a complex migration. They run as the last stage of migration, after the whole document has been loaded and migrated/initialized in the PostLoad methods. Migrators are also registered during Load/PostLoad calls, usually together with creating packets. It is not possible to define the order in which migrators are called, and it is not good practice to create dependencies between them. If there are dependencies, it is better to merge 2 or more migrators and resolve the dependencies within a single migrator.
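The advice to merge dependent migrators can be illustrated with a toy registry keyed by type. Everything here is a hypothetical sketch (ToyMigrator, ToyMigratorRegistry, SplitThenRelinkMigrator and the two "steps" are invented for the example); in particular my AddMigrator returns bool to make the failure visible, which is an assumption of the sketch and not the library's signature.

```cpp
#include <map>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical migrator interface.
class ToyMigrator
{
public:
    virtual ~ToyMigrator() = default;
    virtual void Migrate() = 0;
};

// Hypothetical registry: at most one migrator per dynamic type.
class ToyMigratorRegistry
{
public:
    // Returns false when a migrator of the same type is already registered.
    bool AddMigrator(std::unique_ptr<ToyMigrator> pMigrator)
    {
        if(!pMigrator)
            return false;
        const ToyMigrator& ref = *pMigrator;
        const std::type_index key(typeid(ref)); // dynamic type of the migrator
        return m_Migrators.emplace(key, std::move(pMigrator)).second;
    }

    // Returns nullptr if no migrator of that type is registered.
    ToyMigrator* FindMigrator(const std::type_info& info) const
    {
        const auto it = m_Migrators.find(std::type_index(info));
        return it != m_Migrators.end() ? it->second.get() : nullptr;
    }

    // The call order between different migrator types is unspecified here.
    void ExecuteAll()
    {
        for(auto& entry : m_Migrators)
            entry.second->Migrate();
    }

private:
    std::map<std::type_index, std::unique_ptr<ToyMigrator>> m_Migrators;
};

// Two dependent steps merged into one migrator, so their order is explicit.
class SplitThenRelinkMigrator : public ToyMigrator
{
public:
    explicit SplitThenRelinkMigrator(std::vector<int>& log) : m_Log(log) {}

    void Migrate() override
    {
        m_Log.push_back(1); // step 1: hypothetical "split" step
        m_Log.push_back(2); // step 2: hypothetical "relink" step, relies on step 1
    }

private:
    std::vector<int>& m_Log;
};
```

Since the order between separate migrators is not defined, putting both steps into one Migrate() call is the only place where "step 1 before step 2" is actually guaranteed.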
MigrationManager description
bool Execute(BaseInArchive& ar)
Calls GetInputObjects on all objects (pointers) loaded from the archive, builds a dependency graph and calls the PostLoad notifications in the requested order.
template<typename T> void AddExternObject(BaseInArchive& ar, T& inputObject)
Registers an external object with the archive. During loading, the archive doesn't store the addresses of objects loaded by reference, because those addresses can be invalidated later when other objects are loaded. Just consider objects stored in a std::vector: if a new object is inserted, the storage may be reallocated. The archive cannot track changes like this. It is up to the user to register such objects with MigrationManager if the notifications should be called on them. Note that all objects added through GetInputObjects also receive the PostLoad notification, even if they were loaded as references from the archive. So AddExternObject should be called mainly for top-level references which are not input objects of any other object.
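The std::vector case can be checked with a few lines of plain C++. The sketch is independent of the library; Item and StorageMovedAfterGrowth are hypothetical names, and the result relies on the usual behaviour of std::vector with the default allocator, where the new storage is allocated before the old one is released.

```cpp
#include <cstdint>
#include <vector>

struct Item
{
    int m_Value = 0;
};

// Returns true if growing the vector moved its elements to new storage.
inline bool StorageMovedAfterGrowth()
{
    std::vector<Item> items(1);

    // Remember the address as an integer, so the stale pointer itself is never used.
    const auto before = reinterpret_cast<std::uintptr_t>(items.data());

    // Requesting more than capacity() forces a reallocation.
    items.reserve(items.capacity() + 1);

    const auto after = reinterpret_cast<std::uintptr_t>(items.data());
    return before != after;
}
```

An address remembered while the first element was loaded would therefore be stale once the vector grows, which is why such objects should be registered only after the container has reached its final size.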
void AddMigrator(MigratorPtr pMigrator)
Register a migrator with MigrationManager. The type must be unique. Trying to register 2 migrators of the same type will fail.
Migrator* FindMigrator(const type_info& migratorInfo) const
Find a migrator by type. Returns nullptr if not found.
bool RegisterPacket(PacketPtr pPacket)
Register a packet with MigrationManager. If the packet holds a key which was used already by another packet, registration fails.
void UnregisterPacket(const TypeInfo::Key& key)
Unregister an already registered packet. If the key is not used by any packet, the method does nothing.
Packet* GetPacket(const TypeInfo::Key& key) const
Find a packet by key. Returns nullptr if packet was not found.
Compilation errors, template mess, etc.
Initially I wanted a library that is easy to integrate. For me, easy integration means that compilation errors are easy to resolve when a mistake is made during integration. I think I failed here: the compilation errors are as strange as those that pop up when integrating the boost serialization library. The main problem lies with templates themselves. A template specialization is a completely new class in which the programmer can write whatever they like. The library of course expects those classes to have certain features, such as static functions with a certain interface, but I didn't find a way to verify that the interface of the class is correct. The compiler then produces a strange error at the call site. Another kind of error comes from the visibility of templates: if the library expects a specialization of a template, your specialization must be visible at that point. I added as many static_asserts as possible to the code to make clear what is going wrong, but it was not always possible.
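One technique that may help in some of these cases is the so-called detection idiom: a small trait checks whether a specialization offers a call with the expected arguments, and a static_assert reports the missing interface in plain words. The sketch below is not taken from the library; ToyInArchive, ToySerializer, HasStaticLoad and LoadObject are hypothetical names, and VoidT is a hand-written equivalent of std::void_t from C++17.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical archive type, used only for this sketch.
struct ToyInArchive {};

// Hypothetical customization point: users are expected to specialize it.
template<typename T>
struct ToySerializer {};

// Hand-written equivalent of std::void_t (available since C++17).
template<typename... Ts> struct MakeVoid { using type = void; };
template<typename... Ts> using VoidT = typename MakeVoid<Ts...>::type;

// Primary template: the interface is assumed to be missing.
template<typename T, typename = void>
struct HasStaticLoad : std::false_type {};

// Chosen only if ToySerializer<T>::Load(ToyInArchive&, T&, int) is a valid call.
template<typename T>
struct HasStaticLoad<T, VoidT<decltype(
    ToySerializer<T>::Load(std::declval<ToyInArchive&>(), std::declval<T&>(), 0))>>
    : std::true_type {};

template<typename T>
void LoadObject(ToyInArchive& ar, T& obj, int classVersion)
{
    static_assert(HasStaticLoad<T>::value,
        "ToySerializer<T> must provide: static void Load(ToyInArchive&, T&, int)");
    ToySerializer<T>::Load(ar, obj, classVersion);
}

struct Good { int m_Value = 0; };
struct Bad {};

// The specialization must be visible before the trait is first used for Good.
template<>
struct ToySerializer<Good>
{
    static void Load(ToyInArchive&, Good& obj, int) { obj.m_Value = 42; }
};

static_assert(HasStaticLoad<Good>::value, "Good has the expected interface");
static_assert(!HasStaticLoad<Bad>::value, "Bad has no usable ToySerializer specialization");
```

Calling LoadObject with a Bad object would stop at the static_assert message, although the compiler may still print the follow-up error about the missing Load call after it. The visibility problem is not solved by this: the trait sees only the specializations that are declared before its first use.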
Implementation notes
The library uses boost only in ObjectDependencyGraph.cpp, for sorting dependencies (using the boost graph library). The unit tests are also written with boost (the boost unit test framework), but they are not needed for building the library itself.
History
21-06-2016 Initial release
27-06-2016 Added PointerOwnershipTests and fixed a bug when loading 2 null pointers of the same type as std::unique_ptr