Restore Class View Add-In 2.0

21 Jan 2004
An add-in to restore class view information for a workspace, using binary or XML format

This project's ClassView structure

Introduction

I was reading Jignesh Patel's article here on CP and I found someone asking for a text version of the ClassView information. Actually, I needed that too for a project where I wanted to rename some classes and source files, so I took the challenge and tried to decode ClassView binary data.

This is for you, Uwe! ;-)

Binary formats and Hex editors

Since the ".opt" file is a Compound Document (or Structured Storage) file, I opened it with the DocFile Viewer utility that comes with Visual Studio and used its internal Hex viewer to try to decode the "ClassView Window" stream.

The first thing I noticed was that the stream length reported in the window's title bar was larger than what seemed to be the actual data: some text strings and a few header bytes. The rest of the stream is just garbage, so it is sufficient to save the stream length and pad the stream with zeroes when loading.
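The padding step can be sketched as follows. This is only an illustration, not the add-in's code: padToLength is a hypothetical helper, and it assumes the meaningful bytes and the originally saved stream length are already at hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns `data` extended with zero bytes so that it is
// exactly `streamLength` bytes long, where `streamLength` is the length that
// was recorded when the stream was saved.
std::vector<std::uint8_t> padToLength(std::vector<std::uint8_t> data,
                                      std::size_t streamLength)
{
    // The meaningful data must fit inside the original stream.
    assert(data.size() <= streamLength);
    data.resize(streamLength, 0); // appended bytes are zero-filled
    return data;
}
```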

The next thing to do when you find strings in a block of binary data is to see whether they are zero terminated, like C strings, or prefixed with their length, like Pascal or Basic strings, or maybe fixed length. I found that all strings were prefixed with a BYTE or a WORD holding the string's length.

The other fields all looked like WORD or DWORD values or flags. Some could be identified as the number of projects in the workspace, the number of folders in a project, or the number of subfolders and classes in a folder.

A strange thing was the presence of two unknown class names before the first project in the workspace and before the first folder of the first project: CClsFldSlob and CClassSubfolderSlob. I suspected these classes were owned by the Visual Studio IDE, so I used Nick Hodapp's OpenVC add-in and looked for them:

CClsFldSlob
CClassSubfolderSlob

Once I had found them, I started wondering whether I could use this information for my purposes, but my research into those two classes ended there (see the Addendum). Maybe their presence among the other data is due to the process used to serialize data into and out of the stream, maybe it's just MFC's serialization support, but I can't tell because I have never used it.

No problem: this extraneous data can easily be identified by a WORD prefix of 0xFFFF. Then comes another WORD of value 0x0001 or 0x0002 (which I called "slob level", but which is actually the MFC object schema), the class name, and then the first project or folder of the whole workspace. The other projects have a WORD prefix of 0x8001, while folders have a WORD prefix of 0x8003 (both are special index values, see the Addendum).
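A parser can branch on that first WORD. The sketch below only encodes the three prefix values mentioned above; ItemKind and classifyPrefix are hypothetical names, not taken from the CCLVParser class, and any other value is simply reported as unknown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of the WORD prefix that starts each item.
enum class ItemKind { NewClass, Project, Folder, Unknown };

ItemKind classifyPrefix(std::uint16_t prefix)
{
    switch (prefix) {
    case 0xFFFF: return ItemKind::NewClass; // schema and class name follow
    case 0x8001: return ItemKind::Project;  // a project after the first one
    case 0x8003: return ItemKind::Folder;   // a folder after the first one
    default:     return ItemKind::Unknown;  // not described in this article
    }
}
```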

All these could be flags or integer numbers. I chose not to write the first WORD prefix explicitly to XML, since it is implicit in the type of item considered (project, folder or "slob", the latter dropped in the latest version, see the Addendum).

For each project there is a DWORD count of the project's folders, while for each folder there are two DWORDs counting its subfolders and the classes it contains. The tree is serialized in pre-order (parent before children): the subfolders are nested right after the subfolder count.
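The visiting order can be illustrated with a small in-memory model. This is a sketch under assumptions: the Folder struct and visit function are hypothetical, the output is a readable trace rather than the binary stream, and placing the folder name first and the class list after the subfolders is my guess at the layout, not something the text above states.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory model of a ClassView folder.
struct Folder {
    std::string name;
    std::vector<Folder> subfolders;
    std::vector<std::string> classes;
};

// Appends a readable trace of the pre-order visit to `out`.
void visit(const Folder& folder, std::vector<std::string>& out)
{
    out.push_back("folder " + folder.name);
    out.push_back("subfolders=" + std::to_string(folder.subfolders.size()));
    for (const Folder& sub : folder.subfolders)
        visit(sub, out); // nested right after the subfolder count
    out.push_back("classes=" + std::to_string(folder.classes.size()));
    for (const std::string& cls : folder.classes)
        out.push_back("class " + cls);
}
```

For a folder "Root" that holds one class and one subfolder "Child", the whole of "Child" (including its own counts and classes) appears in the trace before the class count of "Root".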

All the details can be found in the CCLVParser class, but don't expect documentation of the ClassView data format, just commented source code.

Addendum: MFC Serialization

I was trying to investigate the problems reported by some users of this add-in and decided to have a look at the way MFC handles object serialization. I was right: the binary data in the "ClassView Window" stream is compatible with the serialization support of MFC, which was almost certainly used to read and write the stream.

Let's skip the first DWORD of the stream, which I treat as a signature because it never changes, although it may have a deeper meaning. The next field is a DWORD count of the projects in the workspace, followed by a representation of each project. This is very close to how MFC collections are stored during serialization. Take for example the Serialize method of CObArray or CObList, either of which could conveniently be used to store the list of projects in a workspace:

void CObArray::Serialize(CArchive& ar)
{
    ASSERT_VALID(this);
    CObject::Serialize(ar);
    if (ar.IsStoring())
    {
        ar.WriteCount(m_nSize);
        for (int i = 0; i < m_nSize; i++)
            ar << m_pData[i];
    }
    else
    {
        ...
    }
}

Obviously this code cannot be the source of our stream, because the object itself is first serialized and there's no trace of MFC collection classes in the data stream before the initial count, but I hope you got the picture. First the object count is stored in the stream, then each object in the array or list gets stored in turn. This same pattern is used for the list of folders in a project and the classes they contain.

Other things I discovered about ClassView, using the OpenVC Add-in and information stored in the stream, are:

  • A project is represented by an object of class CClsFldSlob
  • A folder is represented by an object of class CClassSubfolderSlob
  • The other container items (classes and the Globals folder) are represented by objects of class CClassViewSlob, from which the above two are derived

Class items and the Globals folder are both implemented as containers of variables and functions, but as objects they are not serialized in our stream. They are probably generated from the source code each time we open the workspace and then associated to folders as specified by the ClassView stream.

Project and folder items, instead, are stored exactly the way MFC implements serialization. When an object is serialized through CArchive::WriteObject, its RUNTIME_CLASS is serialized too. Both are written to the stream only the first time; subsequent times an index is used to reference them. This way MFC avoids duplicating RUNTIME_CLASS information and handles multiple references to the same object. The RUNTIME_CLASS is placed before the object so that the CRuntimeClass::CreateObject function can be located when the object is read back from the stream. Here follows a stripped-down version of the WriteObject function:

void CArchive::WriteObject(const CObject* pOb)
{
    if ((nObIndex = (DWORD)(*m_pStoreMap)[(void*)pOb]) != 0)
    {
        // save out index of already stored object

        *this << (WORD)nObIndex;
    }
    else
    {
        // write class of object first

        CRuntimeClass* pClassRef = pOb->GetRuntimeClass();
        WriteClass(pClassRef);
        // enter in stored object table

        (*m_pStoreMap)[(void*)pOb] = (void*)m_nMapCount++;
        // cause the object to serialize itself

        ((CObject*)pOb)->Serialize(*this);
    }
}

The objects in our stream are unique, so let's consider the else branch of the condition. The WriteClass function writes a WORD prefix defined as 0xFFFF, then writes enough information to identify the RUNTIME_CLASS. The same file (ARCOBJ.CPP), which is part of the MFC 6.0 sources, defines some constants that also appear in the ClassView stream:

#define wNewClassTag    ((WORD)0xFFFF)      
    // special tag indicating new CRuntimeClass

#define wClassTag       ((WORD)0x8000)      
    // 0x8000 indicates class tag (OR'd)

The first is the special prefix we have just seen, that precedes any new RUNTIME_CLASS definition. After this prefix we find the object schema as a WORD value (which is 0x0001 for project items and 0x0002 for folders), and the class name as a counted string with the character count also stored as a WORD. Take a look at the source, where inessential code has been removed:

void CArchive::WriteClass(const CRuntimeClass* pClassRef)
{
    if ((nClassIndex = (DWORD)(*m_pStoreMap)[(void*)pClassRef]) != 0)
    {
        // previously seen class, write out the index tagged by high bit

        *this << (WORD)(wClassTag | nClassIndex);
    }
    else
    {
        // store new class

        *this << wNewClassTag;
        pClassRef->Store(*this);
        // store new class reference in map

        (*m_pStoreMap)[(void*)pClassRef] = (void*)m_nMapCount++;
    }
}

void CRuntimeClass::Store(CArchive& ar) const
    // stores a runtime class description
{
    WORD nLen = (WORD)lstrlenA(m_lpszClassName);
    ar << (WORD)m_wSchema << nLen;
    ar.Write(m_lpszClassName, nLen*sizeof(char));
}

As you can see, RUNTIME_CLASS information is stored only the first time, as with multiply referenced objects. Subsequent times only a WORD index is stored, OR'ed with the special value 0x8000; this is what we see in the ClassView stream for the projects and folders that follow the first.
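Reading such a new-class record back is straightforward. The sketch below is not the add-in's parser: ClassRecord, readWord and readNewClass are hypothetical names, the stream is assumed to be fully loaded into a byte vector, and WORD values are assumed to be little-endian, as written by MFC on x86.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

struct ClassRecord {
    std::uint16_t schema; // object schema (0x0001 or 0x0002 in the stream)
    std::string name;     // class name, stored without a terminating zero
};

// Reads a little-endian WORD at `pos` and advances `pos`.
std::uint16_t readWord(const std::vector<std::uint8_t>& buf, std::size_t& pos)
{
    if (pos > buf.size() || buf.size() - pos < 2)
        throw std::out_of_range("truncated stream");
    std::uint16_t value =
        static_cast<std::uint16_t>(buf[pos] | (buf[pos + 1] << 8));
    pos += 2;
    return value;
}

// Parses: 0xFFFF tag, WORD schema, WORD name length, name bytes.
ClassRecord readNewClass(const std::vector<std::uint8_t>& buf, std::size_t& pos)
{
    if (readWord(buf, pos) != 0xFFFF)
        throw std::runtime_error("not a new-class tag");
    ClassRecord rec;
    rec.schema = readWord(buf, pos);
    std::size_t len = readWord(buf, pos);
    if (buf.size() - pos < len)
        throw std::out_of_range("truncated class name");
    auto first = buf.begin() + static_cast<std::ptrdiff_t>(pos);
    rec.name.assign(first, first + static_cast<std::ptrdiff_t>(len));
    pos += len;
    return rec;
}
```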

After the class information, either complete or as an indexed reference, comes a list of folders and, for a folder item, also a list of classes. Folders, which can be nested, are serialized with the same pattern as the projects in a workspace: a DWORD count followed by each folder object in turn. Classes are stored more simply: a DWORD count followed by the name of each class as a counted string with a BYTE length. I suspect that generic strings like these, unlike the class names seen above, are also stored using an MFC-provided function, namely:

CArchive& operator<<(CArchive& ar, const CString& string)
{
    if (string.GetData()->nDataLength < 255)
    {
        ar << (BYTE)string.GetData()->nDataLength;
    }
    else if (string.GetData()->nDataLength < 0xfffe)
    {
        ar << (BYTE)0xff;
        ar << (WORD)string.GetData()->nDataLength;
    }
    else
    {
        ar << (BYTE)0xff;
        ar << (WORD)0xffff;
        ar << (DWORD)string.GetData()->nDataLength;
    }
    ar.Write(string.m_pchData, string.GetData()->nDataLength);
    return ar;
}

Unfortunately, I could not verify this hypothesis since my class names were always shorter than 255 characters and I also doubt the compiler would accept identifiers any longer than that, but the data in the stream is compatible with this code and it is another clue that MFC serialization support has been used for ClassView data.
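For completeness, here is how the length prefix written by that operator could be read back. It is a sketch that mirrors the code above, not something verified against real ClassView data with long names: readLE and readCountedString are hypothetical helpers, and values are assumed to be little-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Reads an unsigned little-endian integer of `size` bytes (1, 2 or 4).
std::uint32_t readLE(const std::vector<std::uint8_t>& buf, std::size_t& pos,
                     std::size_t size)
{
    if (pos > buf.size() || buf.size() - pos < size)
        throw std::out_of_range("truncated stream");
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < size; ++i)
        value |= static_cast<std::uint32_t>(buf[pos + i]) << (8 * i);
    pos += size;
    return value;
}

// Reads a string whose length is a BYTE, or 0xFF followed by a WORD,
// or 0xFF 0xFFFF followed by a DWORD.
std::string readCountedString(const std::vector<std::uint8_t>& buf,
                              std::size_t& pos)
{
    std::uint32_t len = readLE(buf, pos, 1);  // BYTE length
    if (len == 0xFF) {
        len = readLE(buf, pos, 2);            // escaped: WORD length
        if (len == 0xFFFF)
            len = readLE(buf, pos, 4);        // escaped again: DWORD length
    }
    if (buf.size() - pos < len)
        throw std::out_of_range("truncated string");
    auto first = buf.begin() + static_cast<std::ptrdiff_t>(pos);
    std::string text(first, first + static_cast<std::ptrdiff_t>(len));
    pos += len;
    return text;
}
```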

As a last note, empty lists of folders and classes are stored as just the initial count, set to zero. Indexed references to serialized classes use an integer variable that is incremented each time a new class or a new object is written to the stream. The new version of this add-in takes this into account and removes the unneeded <SLOB> tag from the XML format I used for the stream (old files are still interpreted correctly).
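That shared counter explains why later projects are tagged 0x8001 and later folders 0x8003. The toy model below is not MFC code: TagModel is a hypothetical class, and it assumes the counter starts at 1 (index 0 being reserved), which is consistent with the values seen in the stream. Each new class takes one index and each object takes the next.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy model of the index bookkeeping done by WriteObject and WriteClass.
class TagModel {
public:
    // Returns the WORD written in front of a new, unique object of the
    // given class: 0xFFFF for a new class, 0x8000 | index otherwise.
    std::uint16_t writeObject(const std::string& className)
    {
        std::uint16_t tag;
        auto it = classIndex_.find(className);
        if (it != classIndex_.end()) {
            tag = static_cast<std::uint16_t>(0x8000 | it->second);
        } else {
            tag = 0xFFFF;                     // wNewClassTag
            classIndex_[className] = count_;  // the class takes one index
            ++count_;
        }
        ++count_;                             // the object takes the next one
        return tag;
    }

private:
    std::map<std::string, std::uint32_t> classIndex_;
    std::uint32_t count_ = 1; // assumption: index 0 is reserved
};
```

Writing the first project gives CClsFldSlob index 1 and the project object index 2; the first folder then gives CClassSubfolderSlob index 3, so later folders are tagged 0x8003 and later projects 0x8001.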

Add-In Usage

Using the add-in is pretty simple. See this page for installation instructions if you don't already know how to install add-ins. Once installed, the add-in is ready to use. This is what the add-in's toolbar looks like:

Add-In default toolbar

You have three buttons:

  1. About RestoreClassView
  2. Restore ClassView folders
  3. Save ClassView folders

They should all be self-explanatory. You may choose between binary (default) and XML format; your choice is recorded in the registry, under the key "HKCU\Software\The Code Project\RestoreClassViewAddin".

If you rename a class in your project you may lose all your workspace's folders. You can then use the XML file format to recover your folders: just replace the class name in the XML file and reload it with the RestoreClassView Add-in.

After all, that's what all this is meant for.

Updates

19 Jan 2004

Changed parsing routines and XML format, with a more robust restore operation from XML.

15 Apr 2002

Fixed a bug: the Restore procedure was always looking for the last saved file, regardless of the user's choice.
Added a sentence I had forgotten in the article.

Acknowledgements

This project is based on the following people's work:

My changes for Version 2.0 of the add-in:

  • Three separate toolbar buttons to access the add-in's functions
  • XML conversion of ClassView data

Any comment or suggestion is appreciated.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
