
Debugger support for CRC hash values

4 Mar 2007
Using Expression Evaluator to convert a CRC hash value to a string in the debugger.

Introduction

Often an application needs to refer to objects indirectly. These can be textures, meshes, or other resources in a game engine, plug-ins in a CAD system, records in a database, controls in a GUI system, etc. This is done by assigning each object a unique ID.

One approach is to use text strings as IDs. The big advantage of text strings is that they are easy for us to read, both in the source and during debugging. However, they have several problems. Strings have different lengths. To store such IDs in a structure or a class, you either need a buffer big enough for the longest string, or you need to allocate the string from the heap - both have their shortcomings. Comparing two strings is also a costly operation. If you search for objects by name in a data structure, most of the time is spent comparing keys.

Instead of text strings, you can use the hash values of strings. Identical strings produce identical hash values, and different strings usually produce different values. CRC32 is one of the most commonly used hash functions for this purpose. It produces a 32-bit value that fits nicely in any structure and is easily stored in a register. Comparing two CRCs is also fast. For more on strings vs. CRCs, check out [1].

At Pandemic Studios (where I work) we've been using CRCs for years in our games. We found them to be a very versatile tool. We use them as resource names, object names, file names, for identifying AI scripts and for many other purposes. The main problem with CRCs is that during debugging you only see a plain number. Wouldn't it be nice if you could view the original strings in the debugger?

Let's first start with a simple CRC class implementation.

CRC class implementation

The presented CRC class is pretty straightforward. You can construct a CRC object from a string or from a direct numeric value, you can use a copy constructor, and also compare two CRC objects. It also has an Append method, which allows you to append more characters to a CRC object:

CRC crc1("Hello World");
CRC crc2("Hello");
crc2.Append(" World"); // Now crc1 and crc2 have the same value
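The Append behavior follows from the way a CRC is computed: the value is a running state that is updated one character at a time, so feeding "Hello" and then " World" ends in the same state as feeding "Hello World" at once. Below is a minimal sketch of that idea. MiniCrc is a hypothetical stand-in, not the article's CRC class, and the standard reflected CRC-32 polynomial (0xEDB88320) is an assumption; the real class may use a different variant.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the article's CRC class. It assumes the standard
// reflected CRC-32 (polynomial 0xEDB88320); the real class may differ.
class MiniCrc
{
public:
    MiniCrc() : m_State(0xFFFFFFFFu) {}
    explicit MiniCrc(const char *text) : m_State(0xFFFFFFFFu) { Append(text); }

    // Feeds more characters into the running CRC state.
    MiniCrc &Append(const char *text)
    {
        const std::size_t len = std::strlen(text);
        for (std::size_t i = 0; i < len; ++i)
        {
            m_State ^= static_cast<unsigned char>(text[i]);
            for (int bit = 0; bit < 8; ++bit)
                m_State = (m_State >> 1) ^ (0xEDB88320u & (0u - (m_State & 1u)));
        }
        return *this;
    }

    // Final value: the running state with the closing XOR applied.
    std::uint32_t Value() const { return m_State ^ 0xFFFFFFFFu; }

    bool operator==(const MiniCrc &other) const { return m_State == other.m_State; }

private:
    std::uint32_t m_State; // running state before the final XOR
};
```

With this sketch, MiniCrc("Hello World") compares equal to MiniCrc("Hello").Append(" World"). The bit-by-bit loop is chosen for brevity; a table-driven version would be faster.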

When CRC_STRINGS is defined, the CRC class keeps a global tree structure (a std::map) that contains every string the application has turned into a CRC. When a new CRC object is constructed from a string, the string is added to the tree. The tree therefore only grows for the duration of the application: every new unique string is added to it, and nothing is removed. You can use the GetStr function to look up the string from the CRC value:

#ifdef CRC_STRINGS
// Resolves the CRC to a string
const char *CRC::GetStr( void ) const
{
    // default string if the CRC is not found in the map
    static const char *null="NULL";
    CCRCMap::const_iterator it=s_CRCMap.find(m_Crc);
    if (it!=s_CRCMap.end())
        return it->second;
    else
        return null;
}
#endif

This is how you use it:

printf("%s\n", crc2.GetStr()); // prints: Hello World
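GetStr shows only the lookup side of the string tree. The registration side might look roughly like the sketch below. StringRegistry, Register, and Lookup are hypothetical names, not part of the article's source, and the collision check is an addition that the original class may not perform.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical registry: maps a hash value back to the text it was made from.
class StringRegistry
{
public:
    // Records the text for a hash. Returns false if the hash is already
    // registered with different text, which indicates a collision.
    bool Register(std::uint32_t hash, const std::string &text)
    {
        std::pair<Map::iterator, bool> result =
            m_Map.insert(std::make_pair(hash, text));
        return result.second || result.first->second == text;
    }

    // Returns the registered text, or "NULL" when the hash is unknown.
    // The pointer stays valid because entries are never removed.
    const char *Lookup(std::uint32_t hash) const
    {
        Map::const_iterator it = m_Map.find(hash);
        return it != m_Map.end() ? it->second.c_str() : "NULL";
    }

private:
    typedef std::map<std::uint32_t, std::string> Map;
    Map m_Map;
};
```

Because nothing is ever erased, the pointers handed out by Lookup stay stable, which matches the grow-only behavior described above.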

Now comes the tricky question...

How can you do that during debugging?

First try calling GetStr() directly in the watch window

During debugging, you can type crc1.GetStr() in the watch window to get the string. This is not very convenient, as you have to type GetStr() for each CRC object that you want to inspect. This approach also executes code inside the debugged process, so it might have unwanted side effects. For the same reason, it doesn't work when inspecting a postmortem crash dump, where there is no live process to run the code in.

An improvement using Autoexp.dat

Visual Studio supports user-defined rules for displaying custom types through the file Autoexp.dat. If you add:

CRC=<GetStr()>

at the end of the [AutoExpand] section, then the debugger will call the GetStr function whenever it wants to display a CRC object. At first glance, this works for the simple case above. However, it fails in a more complex situation. Consider a structure with CRC members:

struct A
{
    CRC a1;
    CRC a2;
} a;
a.a1=CRC("aaa");
a.a2=CRC("bbb");

If you put the object a in the watch window in Visual Studio 2003 or Visual Studio 2005, a.a1 evaluates properly as a separate expression, but not when a is expanded. Strangely, it works fine in VC6.

With that approach, you also have the disadvantage of executing the code in the debugged process, and it won't work for postmortem debugging.

The best solution so far: A custom Expression Evaluator DLL

An Expression Evaluator (EE) is a DLL that extends the Visual Studio debugger with support for new types. For more details on this, check out [2]. The evaluator must retrieve the 32-bit CRC value, locate the string tree, find the node for the given value, and return the string. Unfortunately, the EE system in Visual Studio is very limited. It only lets you retrieve data from a given address through the ReadDebuggeeMemory function. So how do you locate the string tree? That's where the VSHelper add-in comes in; check out [3].

Check out the file DebugData.cpp in the CRCTest source code. It creates a 4 KB buffer called g_DebugData. The buffer contains pairs of INT_PTRs: the first value in each pair is an identifier, and the second is a pointer to some data structure. In our case, the identifier is 'CRCV', and the pointer refers to the head node of the string tree. The last pair has a terminating identifier of 0.
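The layout described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of DebugData.cpp: DebugPair, g_DebugDataSketch, AddDebugDataSketch, FindDebugDataSketch, and MakeId are hypothetical names, std::intptr_t stands in for INT_PTR, and the identifier is packed the way MSVC evaluates the multi-character literal 'CRCV' (first character in the most significant byte).

```cpp
#include <cstddef>
#include <cstdint>

// One entry of the table: an identifier and a pointer-sized payload.
struct DebugPair
{
    std::intptr_t id;   // FourCC identifier; 0 terminates the list
    std::intptr_t data; // address of the registered structure
};

// 511 usable pairs plus the terminator; 4 KB when pointers are 32 bits wide.
const std::size_t kMaxDebugPairs = 511;
DebugPair g_DebugDataSketch[kMaxDebugPairs + 1] = {};

// Packs four characters with the first one in the most significant byte.
inline std::intptr_t MakeId(char a, char b, char c, char d)
{
    const std::uint32_t packed =
        (static_cast<std::uint32_t>(static_cast<unsigned char>(a)) << 24) |
        (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 16) |
        (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 8) |
        static_cast<std::uint32_t>(static_cast<unsigned char>(d));
    return static_cast<std::intptr_t>(packed);
}

// Registers or updates a pair. Returns false when the table is full.
bool AddDebugDataSketch(std::intptr_t id, const void *data)
{
    for (std::size_t i = 0; i < kMaxDebugPairs; ++i)
    {
        if (g_DebugDataSketch[i].id == 0 || g_DebugDataSketch[i].id == id)
        {
            g_DebugDataSketch[i].id = id;
            g_DebugDataSketch[i].data = reinterpret_cast<std::intptr_t>(data);
            return true;
        }
    }
    return false; // the last slot is never written, so it stays the terminator
}

// Scans the pairs until the terminating identifier 0.
const void *FindDebugDataSketch(std::intptr_t id)
{
    for (std::size_t i = 0; g_DebugDataSketch[i].id != 0; ++i)
        if (g_DebugDataSketch[i].id == id)
            return reinterpret_cast<const void *>(g_DebugDataSketch[i].data);
    return 0;
}
```

A flat array with a terminator suits this job because a reader that can only fetch bytes from an address can scan it without following any pointers.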

When the debugger enters the break mode, either when it hits a breakpoint, when you hit Ctrl+Break, or when you do step by step debugging, it fires an OnEnterBreakMode event. The VSHelper add-in catches the event and evaluates the address of the g_DebugData buffer. The EE DLL can communicate with the VSHelper DLL and get the value.

That's how the CRCView DLL works. Once it gets the head node of the string tree, the rest is easy. It traverses the binary tree until it finds the right key and returns the string. To activate the evaluator add this to Autoexp.dat:

CRC=$ADDIN(<path to the DLL>\CRCView.dll,CRCView)
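The tree lookup that the evaluator performs can be sketched with nothing but read-only memory accesses. Everything in this example is an assumption made for illustration: FakeDebuggee simulates the debuggee address space with a local buffer (a real evaluator would call ReadDebuggeeMemory instead), and RemoteNode is a simplified, hypothetical node layout. The real std::map node layout depends on the compiler and library version, and CRCView has to match it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for the debuggee address space.
struct FakeDebuggee
{
    std::vector<unsigned char> bytes;

    // Copies `size` bytes from `address`; fails if the range is out of bounds.
    bool Read(std::uint32_t address, void *dest, std::size_t size) const
    {
        if (address > bytes.size() || size > bytes.size() - address)
            return false;
        std::memcpy(dest, bytes.data() + address, size);
        return true;
    }
};

// Hypothetical node layout with 32-bit target addresses.
struct RemoteNode
{
    std::uint32_t left;  // address of the left child, 0 if none
    std::uint32_t right; // address of the right child, 0 if none
    std::uint32_t key;   // CRC value
    std::uint32_t text;  // address of the zero-terminated string
};

// Walks the tree from `root` and copies the string stored for `key`.
bool FindText(const FakeDebuggee &mem, std::uint32_t root,
              std::uint32_t key, std::string &out)
{
    std::uint32_t address = root;
    // The depth cap guards against cycles in corrupted data.
    for (int depth = 0; address != 0 && depth < 64; ++depth)
    {
        RemoteNode node;
        if (!mem.Read(address, &node, sizeof(node)))
            return false;
        if (key == node.key)
        {
            out.clear();
            for (std::uint32_t i = 0; i < 256; ++i) // length cap, same reason
            {
                char ch = 0;
                if (!mem.Read(node.text + i, &ch, 1))
                    return false;
                if (ch == 0)
                    return true;
                out += ch;
            }
            return false; // no terminator found within the cap
        }
        address = key < node.key ? node.left : node.right;
    }
    return false;
}
```

Every access goes through Read and is bounds-checked, because the evaluator may be looking at a crashed or corrupted process and must not trust what it finds.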

This solution works in all cases. It works great for members of structures, for debugging crash dumps, with debug tooltips, and even for remote debugging.

At Pandemic Studios we've been using a similar add-in for a couple of years now, with no problems.

What about big-endian?

With Visual Studio you can debug a remote system that runs on a different kind of CPU. It can be an embedded system, a cell phone, or a game console. Sometimes the remote machine is big-endian. The latest version of CRCView.dll detects that by searching for both the 'CRCV' and 'VCRC' identifiers. If 'CRCV' is found, then the target is little-endian. If 'VCRC' is found, then the target is big-endian and the evaluator will byte-swap all values it gets from the debugger:

void BSwap( DWORD *data )
{
    __asm
    {
        mov esi,data
        mov eax,[esi]
        bswap eax
        mov [esi],eax
    }
}
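The inline assembly above only compiles for 32-bit x86 with the Microsoft compiler. A portable sketch of the same swap, together with the identifier check, could look like the following. The names ByteSwap32, DetectEndian, and FromTarget are hypothetical, the constants assume that 'CRCV' packs its first character into the most significant byte (as MSVC does), and the host running the evaluator is assumed to be little-endian.

```cpp
#include <cstdint>

// Portable equivalent of the bswap instruction for a 32-bit value.
inline std::uint32_t ByteSwap32(std::uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

const std::uint32_t kIdLittle = 0x43524356u; // 'CRCV' read from a little-endian target
const std::uint32_t kIdBig = 0x56435243u;    // 'VCRC': the same bytes in reverse order

enum TargetEndian { kUnknownEndian, kLittleEndian, kBigEndian };

// Classifies the target from an identifier as read by a little-endian host.
inline TargetEndian DetectEndian(std::uint32_t rawId)
{
    if (rawId == kIdLittle)
        return kLittleEndian;
    if (rawId == kIdBig)
        return kBigEndian;
    return kUnknownEndian;
}

// Converts a 32-bit value read from the target into host byte order.
inline std::uint32_t FromTarget(std::uint32_t raw, TargetEndian endian)
{
    return endian == kBigEndian ? ByteSwap32(raw) : raw;
}
```

Returning the swapped value instead of modifying it through a pointer keeps the helper usable in expressions; the in-place form used by BSwap works equally well.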

Extending the DebugData system

The g_DebugData buffer supports 511 data pairs. You can register your own data by calling AddDebugData. Give it a unique FourCC identifier and a pointer to your data. Then, write your own EE DLL that searches for that FourCC identifier. For an example of how to do that, check out the FindDebugData function in the CRCView project.

What about VC6?

VC6 doesn't support the OnEnterBreakMode event. The only solution I could find was to place the g_DebugData array at a fixed address (0x3FFF0000 for example). It worked fine when I tested it, but I'm not sure if that address will always be available.

Installation and usage

First download and install the VSHelper add-in [3]. Then, download the CRCView.zip. In the CRCView\Release folder, you'll find the CRCView.dll. Then add this to the [AutoExpand] section of Autoexp.dat:

CRC=$ADDIN(<path to the DLL>\CRCView.dll,CRCView) <- notice there is no space 
                                                     between , and CRCView

In VS 2003 and 2005, the Autoexp.dat file is located in <Visual Studio folder>\Common7\Packages\Debugger. In VC6, the Autoexp.dat file is located in <Visual Studio folder>\Common\MSDev98\Bin.

The last step is to include the CRC class in your own project. Just copy the files CRC.cpp/h and DebugData.cpp/h from the CRCTest folder. Define CRC_STRINGS in the project settings. If you don't define it, the string tree will be disabled, GetStr will be unavailable, and the evaluator will not work either. You may want to define CRC_STRINGS in your debug build and leave it out of the release build to save memory.

Troubleshooting tips: what to do if the CRCs are not shown correctly?

Sometimes instead of the correct text you see {m_Crc=<some number>} or {???} in the debugger. If you see {m_Crc=<some number>} then the Autoexp.dat is not modified correctly. Probably:

  • The CRC=$ADDIN... line was not added to the [AutoExpand] section. If you added the line at the very end of the file, it is most likely inside the [hresult] section
  • Maybe you put the CRC class in some namespace. You have to use the full class name in Autoexp.dat
  • There may be more than one Autoexp.dat file. For example, an embedded system may use an alternate debugger within Visual Studio that has its own Autoexp.dat file

If instead you see {???} in the debugger, then the CRC class is found in Autoexp.dat, but there is another problem. Probably:

  • There is a typo in Autoexp.dat (there must be no space between the comma and "CRCView" - see above)
  • The CRCView.dll is not found (check if the path listed in Autoexp.dat is correct)
  • The CRCView.dll doesn't export the CRCView function or exports it with a decorated name - this can happen if you compiled the DLL yourself and forgot to include the DEF file in the linker settings. Use "dumpbin /exports CRCView.dll" to verify what symbols are exported
  • The CRCView function returns an error (no longer the case, see below)

The latest version of CRCView.dll comes with some troubleshooting features. It never returns an error. Instead, if an error is detected, it puts an error message in the output text and returns S_OK. Possible error messages are:

  • Can't access CRC value - ReadDebuggeeMemory failed. Most likely the address of the CRC value is invalid
  • VSHelper is disabled - the evaluator detected the VSHelper add-in, but it did not provide a valid g_DebugData value. Most likely the functionality is disabled. Check the VSHelper settings [3]
  • Can't find 'CRCV' data - the g_DebugData was not found. Most likely CRC_STRINGS is not defined in your project settings
  • Can't access CRC table - the g_DebugData was found, but the string table (the std::map) it points to is corrupted
  • No text available - the CRC is not in the string table. This can happen if you create a CRC object from a numeric value directly (For example CRC crc(10); )
  • Failed to retrieve text - ReadDebuggeeMemory failed to access the text from the string table. Most likely the table is corrupted

Licensing

The source code, the binaries and this article are owned by Pandemic Studios. They can be freely used for commercial and non-commercial purposes under the terms of the MIT license. A copy of the license is included in the readme.rtf file in CRCView.zip.

Future development

Currently all the memory for the string database (the map nodes and the strings themselves) is allocated from the CRT heap. This data structure only grows in size and is freed at shutdown. It would be more efficient to remove that load from the CRT heap and use a custom allocator optimized for such behavior. One way is to request big blocks of memory from the heap and do multiple sequential allocations inside them. Another way is to reserve a big chunk of the address space with VirtualAlloc and grow the number of physically allocated pages as needed.

It would also be nice to add CRCView.dll to the installer of VSHelper. The installer would then also have to register the DLL in the Autoexp.dat file.
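The first allocation strategy mentioned above (big blocks with sequential allocations inside them) could be sketched like this for the strings alone. StringArena is a hypothetical class, not part of the CRCView sources. It covers character data only; map nodes would additionally need properly aligned allocations.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical grow-only arena: carves string storage out of large blocks
// and releases everything at once in the destructor.
class StringArena
{
public:
    explicit StringArena(std::size_t blockSize = 64 * 1024)
        : m_BlockSize(blockSize), m_Current(0), m_Remaining(0) {}

    ~StringArena()
    {
        for (std::size_t i = 0; i < m_Blocks.size(); ++i)
            std::free(m_Blocks[i]);
    }

    // Copies the text into the arena and returns the stable copy,
    // or a null pointer if a new block cannot be allocated.
    const char *Store(const char *text)
    {
        const std::size_t size = std::strlen(text) + 1;
        if (size > m_Remaining)
        {
            // Oversized strings get a block of their own size.
            const std::size_t blockSize = size > m_BlockSize ? size : m_BlockSize;
            char *block = static_cast<char *>(std::malloc(blockSize));
            if (!block)
                return 0;
            m_Blocks.push_back(block);
            m_Current = block;
            m_Remaining = blockSize;
        }
        char *result = m_Current;
        std::memcpy(result, text, size);
        m_Current += size;
        m_Remaining -= size;
        return result;
    }

    std::size_t BlockCount() const { return m_Blocks.size(); }

private:
    StringArena(const StringArena &);            // not copyable
    StringArena &operator=(const StringArena &); // not assignable

    std::size_t m_BlockSize;
    char *m_Current;         // next free byte in the newest block
    std::size_t m_Remaining; // free bytes left in the newest block
    std::vector<char *> m_Blocks;
};
```

Since nothing is ever freed individually, each allocation is just a pointer bump, and the unused tail of a block is abandoned when a string does not fit. That trade-off suits a database that only grows.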

The DebugData system provides a way for the EE add-ins to access arbitrary data from the application. Maybe someone can come up with another cool use of that feature.

Special thanks

My special thanks to Pandemic Studios and to the Full Spectrum Warrior engineering team with lead coder Alex Boczar.

Links

[1] Practical Hash IDs By Mick West, Game Developer Magazine, Dec 05

[2] EEAddIn Sample: Debugging Expression Evaluator Add-In

[3] VSHelper - Visual Studio IDE enhancements

History

  • Jan, 2006: First version
    • Simple CRC32 implementation with Expression Evaluator for Visual Studio
  • Oct, 2006: New features
    • Support for big-endian targets
    • Troubleshooting features
  • Feb, 2007: Published under the MIT license

