(untagged)

A garbage collection framework for C++ - Part II

William E. Kempf

0.00/5 (No votes)

28 Jan 2001

Revisiting gc_ptr<> in order to add support for polymorphic types.

Download source files - 4 Kb

Download demo project - 18 Kb

Introduction

This article deals with refactoring the code originally presented in A garbage collection framework for C++ in order to allow polymorphic types to be used. If you've not already done so you should read that article first. This article mostly deals with the process in which the code was refactored to illustrate to new programmers how code is maintained in real life situations. If all that you're interested in is using the gc_ptr<> smart pointer in your code you do not need to read this article. Just download the code from the link above to replace the implementation found in the previous article. You may still find a few sections in here worth reading, specifically the section on compiler bugs and on recommended usage.

The Problem

If you've read the original article you'll remember that although I provided a smart pointer that would give you true garbage collection in C++ the implementation had a serious problem. As coded it would not handle polymorphic types. To illustrate the problem you'll have to understand something about how pointers behave when cast to a base type. The following code will help to illustrate the problem and will be used in the rest of my description here.

struct base1
{
   virtual ~base1() { }
   int value1;
};

struct base2
{
   virtual ~base2() { }
   int value2;
};

struct derived : public base1, public base2
{
};

int main()
{
   derived obj;
   base1* p1 = &obj;
   base2* p2 = &obj;
   assert(static_cast<void*>(p1) == static_cast<void*>(p2));
}

What will surprise many with the example above is that p1' and p2 will not point at the same address in memory. This means the assert above will fail. The original implementation relied on casting to "a pointer to cv void" returning the same value as the start of the memory block allocated by new(gc). Obviously this wouldn't always be the case for polymorphic types.

Looking for a Solution

While trying to think of a solution for this problem it occured to several people that the following code would work.

struct base1
{
   virtual ~base1() { }
   int value1;
};

struct base2
{
   virtual ~base2() { }
   int value2;
};

struct derived : public base1, public base2
{
};

int main()
{
   derived obj;
   base1* p1 = &obj;
   base2* p2 = &obj;
   assert(dynamic_cast<void*>(p1) == dynamic_cast<void*>(p2));
}

The reason this works is that dynamic_cast<void*>(p) will return the address of the most derived type of 'p' (ISO/IEC 14822, section 5.2.7/7). At first this seems to be the perfect solution to our problem. However, dynamic_cast<> can't be used on non-polymorphic types. Modifying gc_ptr<> to use dynamic_cast<> instead of static_cast<> would allow polymorphic types to be used, but would then no longer allow non-polymorphic types. In my experience most types are non-polymorphic, so I could not live with this exchange of problems.

This still seemed to be the obvious solution, however, so I set off to find a way to cast a fully typed pointer to "a pointer to cv void" referencing the most derived type regardless of whether or not the original type were polymorphic. One of the first things I did was to pose this as a question on comp.lang.c++. I made a mistake in wording my question, however. Instead of saying "to the address of the most derived type" I said "to the beginning of the object." Someone quickly pointed out that the standard doesn't dictate how an object is to be layed out and that padding may actually exist in the object before the address of any pointer to the object. I already knew this, so being reminded of it shouldn't have been of any help. Surprisingly, however, it was.

You see, I'd been to close to the problem. I'd been grappling with the complexities of mark-and-sweep, calculating algorithm complexities, addressing corner cases in the language such as the lifetime of global data, etc. So I had blinders on. I focused on the "obvious" solution with out considering other possibilities. The little reminder about object layout concepts for some reason sparked my thinking enough to break out of this narrow view and the solution dawned on me.

What's really surprising is how simple and obvious the real solution was. In fact, when I told the solution to Thant Tessman, the author of a template class called circ_ptr on which I based my original code, he responded with "Duh! I'm embarrassed I didn't think of it." I know how he felt. After all, if we go back to the origins of this code, specifically to traditional garbage collection libraries, we'd find that they use the solution themselves.

So, what's the solution? Stop tracking only pointers to the beginning of the allocated memory block and start tracking pointers that reference any memory within the entire block! Pointers cast to base types will still point within this block. So modifying the code that looks up the "node" in the implementation to search all known nodes until we find one that "contains" the address pointed to takes care of most of this problem. As an added bonus we can now have gc_ptr<>'s that point at members of objects allocated with new(gc).

I said that modifying the code that searches for registered nodes took care of most of the problem. What's missing? There are two things missing at this point. The first is the easiest to solve. The gc_ptr<>::get() method casts back from "a pointer to cv void " that references the base of the allocated memory block back to the actual type. This has the same problem as the cast to "a pointer to cv void", and the solution turns out to be even easier. I simply added a real pointer to gc_ptr<> that's set to the pointer value passed in to the constructor and modified the appropriate assignments to handle this as well.

The second problem left to be addressed is deletion of the object once there's no longer any "root pointers" to it. Again, the implementation relied on casting the base pointer back to a real type in order to call operator delete on it. The solution was to add a second pointer to the node's state. This pointer will point at the object that was used to "register" the destructor with the node (for details on this please see the implementation... this bit is next to impossible for me to explain in text here).

With these changes made we've now got an implementation that handles polymorphic types!

Speed Issues

The first time I modified the code to allow polymorphic types everything worked with out a hitch. However, the code used to find the nodes was not efficient. So another round of refactoring ensued. Several things were moved from the header file to the implementation file instead to reduce compile times for clients, since there was no longer a need for them outside of the implementation. Several functions were broken out into multiple functions. The containers used to track pointers and nodes were changed. And finally, code that searched for nodes was broken out into a single function that returned an iterator.

I'm not sure that I've got the fastest possible implementation for find_node, the function that searches for a node that contains a given address. The first implementation simply stepped through all of the nodes until one was found that contained the address. The final implementation only steps through until either a node is found or the address is greater then the current node's base pointer. This relies on the fact that std::set<> is a sorted container. It would be nice if we could use the set's built in searching capabilities to find our node, but all of them (find, lower_bound, upper_bound and equal_range) use an exact search based on the set's predicate template parameter. So if the address isn't exactly equal to the node's base pointer then all of them will simply return an iterator to the container's end(). The standard algorithms aren't much help either. None of them will be any more efficient then the hand coded for loop, and in fact are likely to be worse. The std::binary_search algorithm won't even help us, because even though the container is sorted, it doesn't support random access iterators.

Compiler Bugs

This section of the article will be a bit controversial. There are many people who think that VC++ has adequate support of the standard and aren't too concerned with the fact that VC++ 7 won't be fully compliant. Well, gc_ptr<> illustrates nicely some areas in which VC++ 6 isn't compliant enough to handle simple code. In this case, I expect VC++ 7 will likely handle the code much better, but this will still illustrate why compliance isn't just a nice feature, but something we should demand. I know that the team in charge of the VC++ compiler is concerned with this and that they are working hard on it, so I won't harp on this subject too much. However, because it's impossible to completely work around the bugs, at least to my satisfaction, I must tell you about the problems I encountered.

After fixing the polymorphic type problem I could start to address the interface in more detail than I had with the first version. I wanted gc_ptr<> to follow the same general style as std::auto_ptr<>, both for familiarity for users as well as to insure I didn't revisit mistakes that the standards comittee addressed along the way. One of the things that was changed was to provide two different constructors and assignment operators for use with other gc_ptr<> instances. In my first implementation there was only a templated form that allowed conversions from one gc_ptr<> type to another. The standard contains versions for assignment and construction from identical gc_ptr<> types as well. I'm not sure exactly why this was necessary, but I'm not going to second guess them. I added the non-templated versions for identical types. Surprisingly, the compiler complained about duplicate definitions. Turns out that VC++ can only handle this if the templated forms come first, which is a parsing bug since the standard does not require this.

Another change was to remove assignment to a "real pointer" and to add a reset() method instead. Assignment to a "real pointer" can result in some subtle and unexpected results in some cases, so it should not be allowed. After making this change I recompiled the code and ran the test harness. Surprisingly an object was being collected prematurely. Or at least that's what appeared to be happening. Stepping through the code revealed that what was really happening was much worse. I had failed to correct a line in the test harness that assigned a gc_ptr<> to a "real pointer". The compiler should have caught this and given me an error at compile time, but it did not. Instead, it produced nothing for this command (the line was skipped while stepping through in the debugger) and continued on as if nothing had happened. I got no error or warning, but instead got an executable that behaved differently from what was expected. I tried to distill this into a simpler example to send to someone in e-mail and found that in the simpler form I received the dreaded INTERNAL COMPILER ERROR. The test harness included in the link above includes the code needed to reproduce both compiler behaviors, as well as a "fix" that if you absolutely must will prevent this on VC++ 6. I don't like the fix because it produces a strange compiler error that will confuse anyone who doesn't know why the fix is there. Since I expect VC++ 7 will fix this, I've left the fix commented out. If you want it for VC++ 6, uncomment it.

Usage Notes

Now that I've got an implementation that's fully functional I think it's appropriate to address when you should use this smart pointer. It's not a good drop in replacement for all pointers, nor even a good replacement for ref-counted smart pointers in all situations. First, if you want to use garbage collection exclusively, you're probably using the wrong language. There is no language support for garbage collection, so all available sollutions have the problem that they don't work with memory that's allocated by other libraries, including the standard library.

If you don't need to use it everywhere, but would prefer to use it most of the time, it may be more appropriate for you to use a traditional C/C++ garbage collection library. They totally replace the allocation logic instead of building on top of the normal routines. This allows them some memory and speed optimizations that are impossible here (or at least undesirable). However, remember the drawbacks discussed with this solution in the previous article. The conservative nature of these libraries may be undesirable despite the benefits in speed and memory use.

Even after you decide that using it only for cases where manual memory management is too difficult and error prone you may want to opt for a more traditional ref-counted smart pointer instead. There is some definate overhead involved with this implementation. There's extra memory that's going to be allocated for every object you allocate and every pointer that exists. The allocation will be slower and calls to collect may cause some definate speed problems if not managed carefully. In contrast, a ref-counted pointer doesn't use much more memory than a standard pointer and will operate nearly as efficiently. For this reason, if speed and/or memory is important to you then the gc_ptr<> class should be used sparingly. Only use it in places where you know there will be circular references, or where you fear there might be. When you know there won't be any circular references, use the ref-counted pointer instead.

Now that I've talked about the inefficiencies of gc_ptr<> I should point out that every effort was made to insure the best possible performance. When used sparingly I'd expect that you'll not notice any problems in this area.

I'd love to hear of real world experiences that you have using this class. If you've got the time, post it in the comments below so that others will know what your experiences are.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here