
Multithreaded garbage collection for C++

16 Jun 2005
LibGC 3.0: portable multithreaded garbage collection for C++.

Introduction

LibGC is a very small library (less than 500 lines of code) that adds multithreaded garbage collection to C++.

Usage

Here are the main points on how to use the library:

  1. add the files "gc.hpp" and "gc.cpp" to your project.
  2. use namespace "gc".
  3. inherit your garbage-collected classes from class Object.
  4. use Pointer<T> for pointers to garbage-collected objects where T is your class derived from class Object.
  5. set the size of the garbage-collected memory with the macro GC_MEMORY_SIZE. The default is 64 MB.
  6. enable or disable multithreading support with the macro GC_MULTITHREADED. The default is multithreaded.
  7. do manual collection with the function collectGarbage().
  8. use operator new to allocate garbage-collected objects.

Platforms

LibGC has been tested under Win32 with MSVC++ and DevCpp 4.9.9.2. To use it on other platforms, you have to implement the relevant locking functions:

  • the function 'initLock()' initializes the serialization primitive; called at startup.
  • the function 'deleteLock()' deletes the serialization primitive; called at exit.
  • the function 'lock()' is called to lock the collector against concurrent access from multiple threads.
  • the function 'unlock()' is called to unlock the collector.
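
As a rough illustration of such a port, the four functions could be built on the standard C++ mutex. This is only a sketch: the signatures (no arguments, no return value) and the global variable g_lock are assumptions, so check them against the declarations in "gc.cpp" before using anything like this.

```cpp
#include <cassert>
#include <mutex>

// Assumption: the serialization primitive is one global mutex.
// The name g_lock is made up for this sketch.
static std::mutex* g_lock = nullptr;

// Creates the serialization primitive; meant to be called once at startup.
void initLock() {
    assert(g_lock == nullptr);
    g_lock = new std::mutex;
}

// Destroys the serialization primitive; meant to be called once at exit.
void deleteLock() {
    delete g_lock;
    g_lock = nullptr;
}

// Blocks until the calling thread has exclusive access to the collector.
void lock() {
    assert(g_lock != nullptr);
    g_lock->lock();
}

// Gives up exclusive access so that other threads can proceed.
void unlock() {
    assert(g_lock != nullptr);
    g_lock->unlock();
}
```

A plain std::mutex must not be locked twice by the same thread. If the collector ever calls lock() again while it already holds the lock, std::recursive_mutex would be the safer choice.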

Technical details

LibGC uses the stop-and-copy algorithm. When there is not enough memory for an allocation:

  1. the collector is locked.
  2. the live objects are copied to new space, one after the other (the memory is de-fragmented).
  3. pointers are adjusted accordingly.
  4. the collector is unlocked.
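
The four steps above can be sketched with a toy model. This is not LibGC's code: the names ToyHeap, Cell, evacuate and collect are made up for illustration, every object is a fixed-size cell holding a single reference, and references are plain indices instead of Pointer<T> instances.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Toy object (an assumption of this sketch): a payload plus one reference.
struct Cell {
    int value;
    int next;     // index of the referenced cell, -1 for null
    int forward;  // index of the copy once the cell has been moved, -1 before
};

struct ToyHeap {
    std::vector<Cell> space;  // current space; cells are allocated one after the other
    std::vector<int*> roots;  // addresses of the references held by the program
    std::mutex guard;         // stands in for lock()/unlock()

    int allocate(int value) {
        std::lock_guard<std::mutex> hold(guard);
        space.push_back(Cell{value, -1, -1});
        return static_cast<int>(space.size()) - 1;
    }

    void collect() {
        std::lock_guard<std::mutex> hold(guard);  // step 1; step 4 happens on scope exit
        std::vector<Cell> fresh;                  // the new space
        fresh.reserve(space.size());
        // Step 2: copy everything the roots reference; step 3: adjust the roots.
        for (int* root : roots)
            *root = evacuate(fresh, *root);
        // Steps 2 and 3 again for references stored inside the copied cells.
        for (std::size_t scan = 0; scan < fresh.size(); ++scan) {
            int moved = evacuate(fresh, fresh[scan].next);
            fresh[scan].next = moved;
        }
        space.swap(fresh);  // the old space, garbage included, is dropped here
    }

private:
    // Copies one cell into the new space unless it was copied already,
    // and returns its new index.
    int evacuate(std::vector<Cell>& fresh, int index) {
        if (index < 0) return -1;
        assert(static_cast<std::size_t>(index) < space.size());
        Cell& old = space[index];
        if (old.forward < 0) {
            old.forward = static_cast<int>(fresh.size());
            fresh.push_back(Cell{old.value, old.next, -1});
        }
        return old.forward;
    }
};
```

In this model, a cell that no root can reach is never copied, so it disappears when the spaces are swapped. Two cells that only refer to each other are handled the same way, and any index that is not registered in roots becomes stale after collect().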

If, during the collection, there is not enough memory, then the process exits with a console error message. Other points to be aware of:

  • garbage-collected objects can be declared on the stack.
  • garbage-collected pointers must be used only with objects allocated with operator new; if a pointer is assigned the address of a member object or a stack object, your program will die a painful death.
  • if new objects are not assigned to pointers, they will not be freed. This is necessary because the creation of an object cannot be done as an atomic operation in C++. It is actually the pointer that unlocks the object for collection. In practice, there is rarely a need to create objects on the heap that are not assigned to any pointer, since in C++ they can be declared as stack-based objects.
  • only the special Pointer<T> class must be used to refer to garbage-collected objects. Raw pointers can become dangling at any time, because the collector moves objects.
  • garbage-collected objects can still be deleted with operator delete.
  • maximum size of a garbage-collected object is a little less than 4 MB.
  • upon exit, all remaining objects will be finalized in random order.
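
To see why a raw pointer can dangle while a collector-aware pointer does not, here is a sketch of one way such a pointer class could work. It is an assumption for illustration only and not how Pointer<T> is implemented: TrackedPtr, PtrBase, registry and relocate are hypothetical names, and the registry is not thread-safe.

```cpp
#include <cassert>
#include <set>

struct PtrBase;

// Every live tracked pointer is listed here, so that a moving collector
// could find all of them.
inline std::set<PtrBase*>& registry() {
    static std::set<PtrBase*> all;
    return all;
}

struct PtrBase {
    void* target = nullptr;
    PtrBase() { registry().insert(this); }
    PtrBase(const PtrBase& other) : target(other.target) { registry().insert(this); }
    PtrBase& operator=(const PtrBase&) = default;
    ~PtrBase() { registry().erase(this); }
};

// Typed front end; the collector only ever sees PtrBase.
template <class T>
struct TrackedPtr : PtrBase {
    TrackedPtr(T* p = nullptr) { target = p; }
    T* operator->() const { return static_cast<T*>(target); }
    T& operator*() const { return *static_cast<T*>(target); }
};

// Simulates the collector moving one object: every registered pointer
// that referred to the old address is redirected to the new one.
inline void relocate(void* from, void* to) {
    for (PtrBase* p : registry())
        if (p->target == from) p->target = to;
}
```

After relocate(), a TrackedPtr follows the object to its new address, while a raw pointer taken earlier still holds the old address, because nothing told it about the move.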

Performance

I've run the following test both in Java 1.4 with HotSpot optimization and in C++: 65000 objects are allocated 100 times, and the average time it takes to allocate the 65000 objects is measured. I found the result comparable to Java: around 100 milliseconds on my Athlon 64 with 1 GB RAM. Most probably C++ will be slower (due to the insane tricks needed for the whole thing to work), but you must actually test it if you need top performance. Allocation is linear: one block is allocated after the other.
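
Linear allocation means that handing out a block is little more than advancing an offset. The following sketch shows the idea in isolation. BumpArena is a made-up name and not part of LibGC, and a real collector would run a collection where this sketch returns a null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hands out blocks one after the other from a fixed buffer.
class BumpArena {
public:
    explicit BumpArena(std::size_t bytes) : buffer_(bytes), used_(0) {}

    // Returns the next free block, or nullptr when the arena is full.
    void* allocate(std::size_t bytes) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t left = buffer_.size() - used_;
        if (bytes > left) return nullptr;
        // Round the size up so that the next block starts aligned.
        const std::size_t rounded = (bytes + align - 1) / align * align;
        if (rounded > left) return nullptr;
        void* block = buffer_.data() + used_;
        used_ += rounded;
        assert(used_ <= buffer_.size());
        return block;
    }

    std::size_t used() const { return used_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t used_;
};
```

The cost of an allocation does not depend on how many objects already exist, which is why a copying collector can compete with a conventional heap on allocation speed. The price is paid later, when live objects have to be copied.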

Example

Here is a trivial example that I've used for testing (and it is included in the Zip file). First of all, let's declare our classes:

#include "gc.hpp"


using namespace gc;

class Foo : public Object {

public:
    ~Foo() {}
    int data[500];
};

Then, let's write two threads and run them:

#include <time.h>

#include <windows.h>


DWORD CALLBACK thread_proc(LPVOID params)
{
    Pointer<Foo> p1;
    for(int j = 0; j < 100000; j++) {
        for(int i = 0; i < 65000; i++) {
            p1 = new Foo;
        }
    }
    return 0;
}

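int main()
{
    // run the same allocation loop on a second thread and on the main thread
    HANDLE thread = CreateThread(0, 0, thread_proc, 0, 0, 0);
    thread_proc(0);
    // wait for the second thread, so that main() does not exit while it is still running
    if (thread != NULL) {
        WaitForSingleObject(thread, INFINITE);
        CloseHandle(thread);
    }
    return 0;
}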

The program prints nothing and just keeps running without exhausting memory, which actually means that the collector works OK.

Let's see another example with two classes that refer to each other:

#include <stdio.h>
#include <iostream>

#include "gc.hpp"

using namespace gc;
using std::cout;

class Bar;

class Foo;

class Bar : public Object {

public:
    Pointer<Foo> foo;
    ~Bar() {
        cout << "~Bar\n";
    }
};


class Foo : public Object {

public:
    Pointer<Bar> bar;
    ~Foo() {
        cout << "~Foo\n";
    }
};

void test_reference()
{
    Pointer<Foo> foo = new Foo;
    Pointer<Bar> bar = new Bar;
    foo->bar = bar;
    bar->foo = foo;
}

int main()
{
    test_reference();
    collectGarbage();
    getchar();
    return 0;
}

Running the program above produces the following output:

~Foo
~Bar

which means that the two objects have been collected even though they refer to each other, because no live pointer references them any more!

Notes

If it does not work for you, don't blame me. I've tried to do my best, but I am sure more testing is needed. I've never actually done a bigger hack in my life!

Update

I re-organized the code so it looks much better, and fixed the initialization order.

Update 20 June 2005

I re-coded the whole thing and now it is almost twice as fast as Java, and memory utilization is much better!

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
