This article is the second of a three-part tutorial on serialization.
- Part 1 introduces the basics of serialization.
- Part 2 explains how to gracefully handle reading invalid data stores and support versioning.
- Part 3 describes how to serialize complex objects.
In Part 1, we saw how to serialize a simple object via a CArchive using a serialize() method like this:
int CFoo::serialize (CArchive* pArchive)
{
    int nStatus = SUCCESS;
    ASSERT (pArchive != NULL);
    TRY
    {
        if (pArchive->IsStoring()) {
            (*pArchive) << m_strName;
            (*pArchive) << m_nId;
        }
        else {
            (*pArchive) >> m_strName;
            (*pArchive) >> m_nId;
        }
    }
    CATCH_ALL (pException)
    {
        nStatus = ERROR;
    }
    END_CATCH_ALL
    return (nStatus);
}
There's a problem with this code. What if we mistakenly read a datafile that doesn't contain the expected information? If the datafile doesn't contain a CString followed by an int, our serialize() method would return ERROR. That's nice, but it would be better if we could recognize the situation and return a more specific status code like INVALID_DATAFILE. We can check that we're reading a valid datafile (i.e., one that contains a CFoo object) by using an object signature.
Object Signatures
An object signature is just a character string (e.g., "FooObject") that identifies an object. We add a signature to CFoo by modifying the class definition:
class CFoo
{
    ...
public:
    ...
    CString getSignature();
    ...
protected:
    static const CString Signature;
};
The signature is defined in Foo.cpp:
const CString CFoo::Signature = "FooObject";
Next, we modify the serialize() method to serialize the signature before serializing the object's data members. If an invalid signature is encountered, or if the signature is missing, it's likely that we're attempting to read a data store that doesn't contain a CFoo object. The logic for reading a signed object is: read the signature; if the read fails or the signature doesn't match, return INVALID_DATAFILE; otherwise read the data members, and return ERROR if that read fails. Here's the code:
int CFoo::serialize (CArchive* pArchive)
{
    int nStatus = SUCCESS;
    bool bSignatureRead = false;
    ASSERT (pArchive != NULL);
    TRY
    {
        if (pArchive->IsStoring()) {
            (*pArchive) << getSignature();
            (*pArchive) << m_strName;
            (*pArchive) << m_nId;
        }
        else {
            CString strSignature;
            (*pArchive) >> strSignature;
            bSignatureRead = true;
            if (strSignature.Compare (getSignature()) != 0) {
                return (INVALID_DATAFILE);
            }
            (*pArchive) >> m_strName;
            (*pArchive) >> m_nId;
        }
    }
    CATCH_ALL (pException)
    {
        nStatus = bSignatureRead ? ERROR : INVALID_DATAFILE;
    }
    END_CATCH_ALL
    return (nStatus);
}
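For readers who want to try the signature check without an MFC project, here is a minimal sketch of the same idea in standard C++. It is not the article's CArchive code: std::iostream stands in for CArchive, and the names Status, writeString, readString, and Foo are hypothetical. The 4096-byte length cap is an arbitrary assumption for the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical status codes mirroring SUCCESS / ERROR / INVALID_DATAFILE.
enum Status { kSuccess, kError, kInvalidDatafile };

// Write a string as a 32-bit length followed by its bytes.
void writeString(std::ostream& out, const std::string& s) {
    assert(s.size() <= UINT32_MAX);  // length must fit the 32-bit prefix
    const std::uint32_t n = static_cast<std::uint32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&n), sizeof n);
    out.write(s.data(), n);
}

// Read a length-prefixed string; returns false on a short or implausible read.
bool readString(std::istream& in, std::string& s) {
    std::uint32_t n = 0;
    if (!in.read(reinterpret_cast<char*>(&n), sizeof n)) return false;
    if (n > 4096) return false;  // arbitrary sanity cap (assumption)
    s.resize(n);
    return n == 0 || static_cast<bool>(in.read(&s[0], n));
}

struct Foo {
    std::string name;
    std::int32_t id = 0;

    static const char* signature() { return "FooObject"; }

    // The signature is written first, then the data members.
    void save(std::ostream& out) const {
        writeString(out, signature());
        writeString(out, name);
        out.write(reinterpret_cast<const char*>(&id), sizeof id);
    }

    // A missing or mismatched signature means "not a Foo datafile";
    // a failure after the signature was accepted is a plain read error.
    Status load(std::istream& in) {
        std::string sig;
        if (!readString(in, sig) || sig != signature()) return kInvalidDatafile;
        if (!readString(in, name)) return kError;
        if (!in.read(reinterpret_cast<char*>(&id), sizeof id)) return kError;
        return kSuccess;
    }
};
```

Saving a Foo into a std::stringstream and loading it back should yield kSuccess, while a stream that starts with a different signature, or an empty stream, should yield kInvalidDatafile.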
You should ensure that all your objects have unique signatures. It's less important what the actual signature is. If you're developing a suite of products, it's helpful to have a process for registering object signatures companywide. That way, developers won't mistakenly use the same signature for different objects. If you want to make it harder to reverse engineer your datafiles, you should use signatures that have no obvious connection to object names.
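One way to back such a registration process with code is a small registry that refuses a signature already claimed by another class. The sketch below is only an illustration in standard C++; SignatureRegistry and registerSignature are hypothetical names, and a real company-wide process would more likely live in a shared document or database than in a single program.

```cpp
#include <map>
#include <string>

// Hypothetical registry mapping each signature to the class that owns it.
class SignatureRegistry {
public:
    // Returns true if the signature was free, or is already owned by the
    // same class; returns false if a different class has claimed it.
    bool registerSignature(const std::string& signature,
                           const std::string& owner) {
        auto result = owners_.insert({signature, owner});
        return result.second || result.first->second == owner;
    }

private:
    std::map<std::string, std::string> owners_;
};
```

Registering "FooObject" for CFoo twice succeeds both times, while a later attempt to register "FooObject" for a different class is rejected.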
Versioning
As you upgrade your product during its lifecycle, you may find it necessary to modify the structure of CFoo by adding or removing data members. If you simply released a new version of CFoo, attempts to read old versions of the object from a data store would fail. This is obviously not acceptable. Any version of CFoo should be able to restore itself from an older serialized version. In other words, CFoo's serialization method should always be backward compatible. This is easily accomplished by versioning the object. Just as we added an object signature, we add an integer constant that specifies the object's version number.
class CFoo
{
    ...
public:
    ...
    CString getSignature();
    int getVersion();
    ...
protected:
    static const CString Signature;
    static const int Version;
};
The object's version is defined in Foo.cpp:
const CString CFoo::Signature = "FooObject";
const int CFoo::Version = 1;
Next, we modify the serialize() method to serialize the version after serializing the signature, and before serializing the object's data members. If a newer version is encountered, we're attempting to read an unsupported version of the object. In this case, we simply return the status UNSUPPORTED_VERSION.
int CFoo::serialize (CArchive* pArchive)
{
    int nStatus = SUCCESS;
    bool bSignatureRead = false;
    bool bVersionRead = false;
    ASSERT (pArchive != NULL);
    TRY
    {
        if (pArchive->IsStoring()) {
            (*pArchive) << getSignature();
            (*pArchive) << getVersion();
            (*pArchive) << m_strName;
            (*pArchive) << m_nId;
        }
        else {
            CString strSignature;
            (*pArchive) >> strSignature;
            bSignatureRead = true;
            if (strSignature.Compare (getSignature()) != 0) {
                return (INVALID_DATAFILE);
            }
            int nVersion;
            (*pArchive) >> nVersion;
            bVersionRead = true;
            if (nVersion > getVersion()) {
                return (UNSUPPORTED_VERSION);
            }
            (*pArchive) >> m_strName;
            (*pArchive) >> m_nId;
        }
    }
    CATCH_ALL (pException)
    {
        nStatus = (bSignatureRead && bVersionRead) ? ERROR : INVALID_DATAFILE;
    }
    END_CATCH_ALL
    return (nStatus);
}
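The header check (signature first, then version) can be isolated and exercised on its own. The following is a minimal sketch in standard C++ rather than MFC, assuming a length-prefixed signature followed by a 32-bit version number; writeHeader, readHeader, and the Status enumerators are hypothetical names, not part of the article's code.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical status codes mirroring the ones used in the article.
enum Status { kSuccess, kError, kInvalidDatafile, kUnsupportedVersion };

// Write the signature (32-bit length plus bytes) followed by the version.
void writeHeader(std::ostream& out, const std::string& sig,
                 std::int32_t version) {
    const std::uint32_t n = static_cast<std::uint32_t>(sig.size());
    out.write(reinterpret_cast<const char*>(&n), sizeof n);
    out.write(sig.data(), n);
    out.write(reinterpret_cast<const char*>(&version), sizeof version);
}

// Any failure before the version has been read is reported as an invalid
// datafile; a version newer than the reader supports is reported separately.
Status readHeader(std::istream& in, const std::string& expectedSig,
                  std::int32_t supportedVersion, std::int32_t& versionOut) {
    std::uint32_t n = 0;
    if (!in.read(reinterpret_cast<char*>(&n), sizeof n) ||
        n != expectedSig.size()) {
        return kInvalidDatafile;
    }
    std::string sig(n, '\0');
    if (n > 0 && !in.read(&sig[0], n)) return kInvalidDatafile;
    if (sig != expectedSig) return kInvalidDatafile;
    if (!in.read(reinterpret_cast<char*>(&versionOut), sizeof versionOut)) {
        return kInvalidDatafile;
    }
    if (versionOut > supportedVersion) return kUnsupportedVersion;
    return kSuccess;
}
```

With this layout, a version 3 header read by code that supports up to version 2 yields kUnsupportedVersion, and a header carrying a different signature yields kInvalidDatafile before the version is ever examined.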
Version 1 of our CFoo contained two data members: a CString (m_strName) and an int (m_nId). If we add a third member (e.g., int m_nDept) in version 2, we need to decide what m_nDept should be initialized to when reading an older version of the object. In this example, we'll initialize m_nDept to -1, implying that the employee's department code is "Unknown".
class CFoo
{
    ...
public:
    CString m_strName;
    int m_nId;
    int m_nDept;
};
We also need to increase the object's version number in Foo.cpp to 2.
const int CFoo::Version = 2;
Finally, we modify the part of serialize() that reads the object so that m_nDept is initialized to -1 if we're reading an older version of the datafile. Note that the file is always saved as the latest version.
int CFoo::serialize (CArchive* pArchive)
{
    ...
    ASSERT (pArchive != NULL);
    TRY
    {
        if (pArchive->IsStoring()) {
            ...
            (*pArchive) << m_strName;
            (*pArchive) << m_nId;
            (*pArchive) << m_nDept;
        }
        else {
            ...
            (*pArchive) >> m_strName;
            (*pArchive) >> m_nId;
            if (nVersion >= 2) {
                (*pArchive) >> m_nDept;
            }
            else {
                m_nDept = -1;
            }
        }
    }
    CATCH_ALL (pException)
    {
        nStatus = (bSignatureRead && bVersionRead) ? ERROR : INVALID_DATAFILE;
    }
    END_CATCH_ALL
    return (nStatus);
}
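To see the upgrade path end to end, here is a minimal sketch in standard C++ that reads a version 1 record with version 2 code and then saves it in the latest layout. It is a simplification under stated assumptions: std::iostream replaces CArchive, the signature and the name member are left out to keep the focus on versioning, and writeInt, readInt, saveAsVersion1, kFooVersion, and kUnknownDept are hypothetical names.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

const std::int32_t kFooVersion = 2;    // current on-disk version (assumed)
const std::int32_t kUnknownDept = -1;  // default for pre-version-2 records

void writeInt(std::ostream& out, std::int32_t value) {
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
}

bool readInt(std::istream& in, std::int32_t& value) {
    return static_cast<bool>(
        in.read(reinterpret_cast<char*>(&value), sizeof value));
}

struct Foo {
    std::int32_t id = 0;
    std::int32_t dept = kUnknownDept;

    // Always writes the latest layout, whatever version was loaded.
    void save(std::ostream& out) const {
        writeInt(out, kFooVersion);
        writeInt(out, id);
        writeInt(out, dept);
    }

    // Accepts any version up to kFooVersion; rejects newer ones.
    bool load(std::istream& in) {
        std::int32_t version = 0;
        if (!readInt(in, version) || version > kFooVersion) return false;
        if (!readInt(in, id)) return false;
        if (version >= 2) return readInt(in, dept);
        dept = kUnknownDept;  // the field did not exist in version 1
        return true;
    }
};

// Hypothetical helper that mimics a file written by the version 1 code.
void saveAsVersion1(std::ostream& out, std::int32_t id) {
    writeInt(out, 1);
    writeInt(out, id);
}
```

Loading a stream produced by saveAsVersion1 leaves dept at -1, and saving that object afterwards writes a version 2 record, so old files are upgraded the first time they are rewritten.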
Conclusion
So far, we've dealt with providing robust support for serializing simple objects, i.e., those that contain readily serializable data types. In Part 3, we'll see how to serialize any kind of object.