A bit of theory
C/C++ is a statically typed language (often loosely called strongly typed). In simple words, the type of every variable is known at compile time only, not at runtime. All type identifiers (type names, member variable names etc.) behave like "constants" in the sense that their value must be known at compile time. Hence C/C++ doesn't provide functions like "is the variable X of type T?" or "create an instance of the type described by the (variable) identifier ID".
The advantage of this approach is performance. All the hard work with types is done by the compiler. Reading and writing non-pointer variables is done by simply reading and writing at an address or offset that the compiler has already fixed.
C++ has two exceptions to the strongly typed language paradigm. The first is the virtual function mechanism. Calling a virtual function can launch different routines, depending on the actual type of the instance it is called on. The advantage is that the actual instance type needn't be known at compile time. The price is lower speed, because invoking a virtual function costs one more dereference than invoking a non-virtual function. The second exception is run-time type information (RTTI). It allows you to compare the types of variables (via typeid) and to determine whether a variable is of a type derived from a desired type (via dynamic_cast). Because RTTI works properly even if the variables were statically typecast, it is an exception to the strongly typed paradigm.
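Both mechanisms can be seen side by side in a minimal sketch. The classes Shape, Circle and Square and the helper functions are made up for illustration; typeid and dynamic_cast are standard C++ and need RTTI to be enabled.

```cpp
#include <string>
#include <typeinfo>

// Hypothetical classes, used only for illustration.
struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const { return "Shape"; }
};

struct Circle : Shape {
    virtual std::string name() const { return "Circle"; }
};

struct Square : Shape {
    virtual std::string name() const { return "Square"; }
};

// Virtual call: the routine is selected by the dynamic type of s,
// although the static type here is always Shape.
inline std::string describe(const Shape& s) { return s.name(); }

// RTTI, part 1: typeid compares the exact dynamic types of two objects.
inline bool same_type(const Shape& a, const Shape& b)
{
    return typeid(a) == typeid(b);
}

// RTTI, part 2: dynamic_cast tests whether the object is a Circle
// (or something derived from Circle) and yields null otherwise.
inline bool is_circle(const Shape* s)
{
    return dynamic_cast<const Circle*>(s) != 0;
}
```

Note that all three helpers receive only a Shape reference or pointer, so the compiler cannot resolve any of the answers at compile time.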
Higher-level C++ descendants (Java, C# etc.) and dynamically typed languages (Perl, Ruby etc.) provide more runtime features that are directly embedded in the language. In most cases the strongly typed approach is sufficient. But serialization, for example, can't be done without some runtime information about types. Consider a collection of objects of various types that you would like to construct from a stream. The type of the object you are constructing depends on data from the stream. The information about types is thus retrieved at runtime, so you need some runtime type information.
The general solution for deserialization is called a class factory. A class factory is a class that creates instances of types specified by a parameter. The parameter can be a variable, so you can decide at runtime which type you want to create. The simplest model of a class factory works like this:
void* create_instance(int typecode)
{
    switch(typecode)
    {
    case 1: return new ClassA;
    case 2: return new ClassB;
    // ... one case for each further class
    default: return 0; // unknown typecode
    }
}
This model isn't very elegant, because whenever you add a new class to the project, you have to browse through the class factory source code and add a line to the switch block. Moreover, in every serializable class you have to declare a static variable (or static function) through which you get the typecode of the class. A better model uses registration, through which you pass information about classes from outside the declaration of the class factory. Registration can be done in various ways and often uses dirty tricks with macros or templates, because strongly typed C++ isn't built for such operations.
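To make the idea of registration concrete, here is a minimal sketch. It is not the implementation described in this article; the names registry, register_type and create_object are assumptions chosen for the example. Each class adds itself to a map with one line, and the factory function never has to be edited.

```cpp
#include <map>

// Hypothetical minimal registry: maps an integer typecode to a creator function.
typedef void* (*creator_fn)();

inline std::map<int, creator_fn>& registry()
{
    // Function-local static: constructed on first use, so it already
    // exists when the first registration below runs.
    static std::map<int, creator_fn> instance;
    return instance;
}

template<class T>
void* create_object() { return new T; }

template<class T>
bool register_type(int typecode)
{
    registry()[typecode] = &create_object<T>;
    return true;
}

inline void* create_instance(int typecode)
{
    std::map<int, creator_fn>::const_iterator it = registry().find(typecode);
    return it != registry().end() ? it->second() : 0; // null for an unknown typecode
}

// Illustrative classes; each registration is a single line next to the class.
struct ClassA { int value; ClassA() : value(1) {} };
struct ClassB { int value; ClassB() : value(2) {} };

static const bool classA_registered = register_type<ClassA>(1);
static const bool classB_registered = register_type<ClassB>(2);
```

Adding a third class then means adding one more register_type line in whatever source file is convenient, and create_instance stays untouched.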
The detailed theory of class factories should follow here. But Patje was faster and wrote the article Different ways of implementing factories at CodeProject :) I decided not to duplicate the work, so follow the link and enjoy.
Why use this class factory?
This class factory is useful for all developers who want to provide serialization independent of MFC. MFC provides a quite robust class factory mechanism (via CRuntimeClass), but all serializable objects must be derived from CObject.
This class factory works quite independently of other classes and can be bound to almost any object model. You needn't derive your classes from any root class. The information about classes is separated from the class factory declaration, and also from the class declarations. Therefore you can register a class without modifying its source code, even if you have the class implementation only in a binary LIB/DLL.
The class factory provides a class tag, in which you can store additional data bound to the given class type. I've written this library to help with my various projects. I know that it can be "purified" in many ways to be more STL-compliant, so I would welcome all your suggestions.
Using the class factory
First of all, set Enable Run-Time Type Information in the project settings (C++, category C++ Language).
Next, declare the class factory. You should declare it in a header file that is included in all source files where you register classes:
#include "factory.h"
DECLARE_CLASS_FACTORY_EX(Root, Key, Tag, Factory_name)
The Root parameter is the name of the root class in your object model. When you create instances via the class factory, they will be cast to the Root* type. If your object model doesn't have a root class, set Root to void.
The Key parameter specifies the type of the class key. When you create instances via the class factory, the key specifies the type of the desired instance. You can use an arbitrary key type, but it must provide operator<, because class information is stored in a std::map collection sorted by the key. The Tag parameter specifies the type of the class tag. Every registered class has a Tag data structure that contains additional data shared by all instances of the class. It's similar to static members, but the tag data is also accessible via the class key. If you don't want to use a class tag, use empty_tag.
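As an illustration of the operator< requirement, the following sketch defines a composite key. The class_key type and the make_sample_index helper are hypothetical and not part of the library; the point is only that any type with a consistent operator< can serve as a std::map key.

```cpp
#include <map>
#include <string>

// Hypothetical key type: a class is identified by a name plus a format version.
struct class_key {
    std::string name;
    int version;
};

// Strict weak ordering required by std::map: compare names first, then versions.
inline bool operator<(const class_key& a, const class_key& b)
{
    if (a.name != b.name) return a.name < b.name;
    return a.version < b.version;
}

// Any std::map keyed by class_key now compiles and sorts by (name, version).
inline std::map<class_key, int> make_sample_index()
{
    std::map<class_key, int> index;
    class_key k1 = { "Circle", 2 };
    class_key k2 = { "Circle", 1 };
    class_key k3 = { "Box", 1 };
    index[k1] = 20;
    index[k2] = 10;
    index[k3] = 30;
    return index;
}
```

A key like this also serializes easily, because both of its members can be written to and read from a stream.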
Factory_name is the name of the global object that represents the class factory. You can use any valid C++ identifier. There is a shorter declaration macro DECLARE_CLASS_FACTORY(Factory_name), which is equivalent to DECLARE_CLASS_FACTORY_EX(void, int, empty_tag, Factory_name). In some cpp file you should define the class factory:
DECLARE_CLASS_FACTORY(Factory_name)
DECLARE_CLASS_FACTORY_EX(Root, Key, Tag, Factory_name)
with the same parameters as in the declaration. You can declare several class factories and use them all at once, but you can't declare more than one class factory with the same Root, Key, Tag parameters.
For class registration, use the following macros:
REGISTER_CLASS(factory, class_name, key)
REGISTER_TAGGED_CLASS(factory, class_name, key, tag)
They register the class_name class in the factory class factory and associate the key and tag (if any) with it. REGISTER_TAGGED_CLASS can be used only with a class factory that has a nonempty tag type. These macros can be placed in any cpp file and in arbitrary order. They must be in global scope, and the declarations of the factory, class, key and tag type must be known at that point. You can also set tag member values somewhere other than in the class registration macro:
SET_TAG_PROPERTY(factory, class_name, property, value)
property is the name of some member of the tag class or structure, and value is a constant expression that will be assigned to property. You can set different properties in different places, but for one class you can set each property only once. If you want to combine REGISTER_TAGGED_CLASS with SET_TAG_PROPERTY, override Tag::operator=. In REGISTER_TAGGED_CLASS only some members will then be assigned via operator=, and the rest can be set with SET_TAG_PROPERTY. If some member is set in both macros, the result is undefined.
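One possible shape of such an operator= is sketched below. The shape_tag type and its convention that an empty string or a zero means "not set" are assumptions made for this example; the library itself only requires that the assignment leaves the other members alone.

```cpp
#include <string>

// Hypothetical tag type; an empty string or a zero means "not set".
struct shape_tag {
    std::string display_name;
    int icon_id;

    shape_tag() : icon_id(0) {}
    shape_tag(const std::string& name, int icon) : display_name(name), icon_id(icon) {}
    shape_tag(const shape_tag& other)
        : display_name(other.display_name), icon_id(other.icon_id) {}

    // Copy only the members that the right-hand side actually sets, so that
    // values stored earlier through another path survive the assignment.
    shape_tag& operator=(const shape_tag& rhs)
    {
        if (!rhs.display_name.empty()) display_name = rhs.display_name;
        if (rhs.icon_id != 0) icon_id = rhs.icon_id;
        return *this;
    }
};
```

With this operator, assigning shape_tag("Circle", 0) to a tag whose icon_id is already 7 changes the name and keeps the icon, whichever of the two writes happens first.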
All registration and tag setting code is executed before main() starts. Hence the following methods of the class factory can be used anywhere in the program:
const Key key(const type_info& ti);
const Tag& tag(const type_info& ti);
const Tag& tag(const Key key);
Root* create(const Key key);
The first two functions return the key and tag values associated with the class specified by ti (see the type_info class in MSDN). The third function returns the tag value associated with the given key, and the last function creates an instance of the type specified by key and casts it to Root*. You are responsible for deleting these instances properly. If the key, or the class with type_info ti, isn't registered in the class factory, the functions throw a not_registered exception.
The demo program shows a simple example of serialization using the class factory.
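The calling pattern for these four methods might look like the sketch below. Since factory.h is not reproduced here, demo_factory and this not_registered are stand-ins written only for the example (with Root = Shape, Key = int, Tag = int), and the stand-in uses std::type_index for its type lookup, which is a simplification. The try_create helper is likewise hypothetical; it shows one way to take ownership of the result and to handle an unknown key.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <typeindex>
#include <typeinfo>

// Stand-ins only: the real not_registered and factory come from factory.h.
struct not_registered : std::runtime_error {
    not_registered() : std::runtime_error("class is not registered") {}
};

struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};
struct Square : Shape {};

inline Shape* make_circle() { return new Circle; }

// Mimics the four methods for Root = Shape, Key = int, Tag = int.
class demo_factory {
public:
    typedef Shape* (*creator_fn)();

    demo_factory()
    {
        // Only Circle is registered here: key 1, tag 100.
        keys_[std::type_index(typeid(Circle))] = 1;
        tags_[1] = 100;
        creators_[1] = &make_circle;
    }
    int key(const std::type_info& ti) const
    {
        std::map<std::type_index, int>::const_iterator it = keys_.find(std::type_index(ti));
        if (it == keys_.end()) throw not_registered();
        return it->second;
    }
    const int& tag(int key) const
    {
        std::map<int, int>::const_iterator it = tags_.find(key);
        if (it == tags_.end()) throw not_registered();
        return it->second;
    }
    const int& tag(const std::type_info& ti) const { return tag(key(ti)); }
    Shape* create(int key) const
    {
        std::map<int, creator_fn>::const_iterator it = creators_.find(key);
        if (it == creators_.end()) throw not_registered();
        return it->second();
    }
private:
    std::map<std::type_index, int> keys_;
    std::map<int, int> tags_;
    std::map<int, creator_fn> creators_;
};

// Caller side: own the result and treat an unknown key as a recoverable error.
inline std::unique_ptr<Shape> try_create(const demo_factory& f, int key)
{
    try {
        return std::unique_ptr<Shape>(f.create(key));
    } catch (const not_registered&) {
        return std::unique_ptr<Shape>();
    }
}
```

When reading from a stream, the key usually comes from the data, so catching not_registered is the natural place to report a corrupt or unsupported file.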
How it all works
The class information is stored in class_info:
template<class Root, class Key, class Tag>
class class_info
{
public:
    Key key;
    Tag tag;
    Root* (*create)();
};
The create function pointer is taken from the class class_creator (a generic class for creating an instance of type T, which is cast to Root*):
template<class Root, class T>
class class_creator
{
public:
    static Root* create()
    {
        return new T;
    }
};
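How the two templates fit together can be sketched as follows. The two class templates are repeated from the text so that the example compiles on its own; the make_info helper and the Shape and Triangle classes are assumptions added for illustration and are not part of the library.

```cpp
#include <string>

// The two templates from the article, repeated so that the sketch compiles alone.
template<class Root, class Key, class Tag>
class class_info
{
public:
    Key key;
    Tag tag;
    Root* (*create)();
};

template<class Root, class T>
class class_creator
{
public:
    static Root* create()
    {
        return new T;
    }
};

// Hypothetical object model.
struct Shape {
    virtual ~Shape() {}
    virtual int sides() const = 0;
};
struct Triangle : Shape {
    virtual int sides() const { return 3; }
};

// Hypothetical helper: fills a class_info record for T. The conversion from
// T* to Root* inside class_creator compiles only if T derives from Root
// (or Root is void).
template<class Root, class T, class Key, class Tag>
class_info<Root, Key, Tag> make_info(const Key& key, const Tag& tag)
{
    class_info<Root, Key, Tag> info;
    info.key = key;
    info.tag = tag;
    info.create = &class_creator<Root, T>::create;
    return info;
}
```

Because class_creator<Root, T>::create is a static member function, its address is an ordinary function pointer, and one pointer type Root* (*)() can hold the creators of all registered classes.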
All class_info records are collected in two std::maps: the first indexed by Key and the second indexed by type_info*. The second index is used when you retrieve class info from a given class instance. Don't use type_info or type_info* as the key type, because it can't be deserialized (you can't obtain a type_info from an integer or string variable) and its member values are compiler-dependent. The registration and tag setting macros define empty global objects of type class_factory_wrapper, whose constructor parameters are the registering function calls. With this dirty trick, all registrations and tag settings are made before main() starts. However, because we construct global objects, we can't assume any order of registrations and tag settings.
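The trick with global objects can be reduced to a few lines. In this sketch, registration_wrapper, register_name and the REGISTER_NAME macro are hypothetical simplifications and not the library's own macros; they only record class names, but the mechanism is the same: the registration call is evaluated as a constructor argument of a global object.

```cpp
#include <string>
#include <vector>

// Function-local static, so the list exists before the first registration runs.
inline std::vector<std::string>& registered_names()
{
    static std::vector<std::string> names;
    return names;
}

// Hypothetical stand-in for class_factory_wrapper: an empty class whose
// constructor argument is the result of the registration call.
struct registration_wrapper {
    explicit registration_wrapper(bool) {}
};

inline bool register_name(const char* name)
{
    registered_names().push_back(name);
    return true;
}

// Hypothetical macro: defines one global wrapper object per registered class.
#define REGISTER_NAME(class_name) \
    static registration_wrapper wrapper_for_##class_name(register_name(#class_name));

struct ClassA {};
struct ClassB {};

REGISTER_NAME(ClassA)
REGISTER_NAME(ClassB)
```

Within one source file the wrappers are constructed in the order of their definitions; across several source files the order is unspecified.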
The key, tag and create methods use map::find to find the appropriate class info, and if the find fails, they throw an exception. However, the tag_ref (used by SET_TAG_PROPERTY) and register_class methods use map::operator[]. Because the order of registration and tag property setting is undefined, both of these functions create a new entry in the class info map if they don't find one.
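The difference between the two std::map operations is easy to demonstrate. The sketch below uses a plain map from int to std::string instead of the real class info map, and the function names lookup and entry are made up for the example.

```cpp
#include <map>
#include <stdexcept>
#include <string>

typedef std::map<int, std::string> info_map;

// Lookup path: find() never modifies the map, so a miss can be reported.
inline const std::string& lookup(const info_map& m, int key)
{
    info_map::const_iterator it = m.find(key);
    if (it == m.end()) throw std::out_of_range("key is not registered");
    return it->second;
}

// Registration path: operator[] default-constructs a missing entry, so it
// works whichever of several writers touches the key first.
inline std::string& entry(info_map& m, int key)
{
    return m[key];
}
```

A failed lookup leaves the map unchanged, while entry(m, key) on an unknown key silently adds an empty record, which is exactly what the registration code needs and what the query methods must avoid.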
The undefined order of construction causes another serious problem: some class can attempt to register in a class factory that hasn't been constructed yet, which ends in an access violation in the map::insert code. That's why the class factory has a static member initial_lock. This dirty trick relies on the fact that global and static integers have an initial value of zero. See the class_factory::unlock() implementation. All class factory methods call unlock(), which constructs both std::maps before any map operations, but only once per program run.
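The idea behind initial_lock can be sketched as follows. This is not the actual class_factory::unlock() code, which is not reproduced in this article; lazy_registry, its members and early_user are assumptions that only show why a zero-initialized integer is a safe guard during static initialization.

```cpp
#include <cstddef>
#include <map>
#include <string>

class lazy_registry {
public:
    static void add(int key, const std::string& name)
    {
        unlock();
        (*names_)[key] = name;
    }
    static std::size_t size()
    {
        unlock();
        return names_->size();
    }
private:
    // Creates the map on the first call only; every later call is a no-op.
    static void unlock()
    {
        if (initial_lock_ == 0) {
            names_ = new std::map<int, std::string>; // intentionally never deleted
            initial_lock_ = 1;
        }
    }
    static int initial_lock_;
    static std::map<int, std::string>* names_;
};

// No initializers: objects with static storage are zero-initialized before
// any constructor of a global object runs.
int lazy_registry::initial_lock_;
std::map<int, std::string>* lazy_registry::names_;

// A global object whose constructor uses the registry during static initialization.
struct early_user {
    early_user() { lazy_registry::add(1, "early"); }
};
static early_user early_instance;
```

The sketch is not thread-safe, which does not matter as long as all registrations happen before main() on a single thread.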
History
This article is sooo young to have a history...