A new way to implement Delegate in C++

Quynh Nguyen


25 May 2007


Solving issues with some current implementations of Delegate in C++


Index

Introduction
Background
Under the hood
Using the code
A comparison of performance
Points of interest
What's next?
- Multicast Delegate
- More portable
History

Introduction

Delegate in C++ is not a new concept. There are many implementations on Code Project and on the Internet in general. In my opinion, the most comprehensive article is by Don Clugston, in which Don shows all of the difficulties that people face with pointer-to-method. Sergey Ryazanov, on the other hand, introduces a very simple solution that uses the "non-type template parameter" feature of modern C++. Among Boost users, Boost.Function is probably the best known. Here is a comparison of these implementations.

	Boost.Function	Sergey Ryazanov	Don Clugston
Characteristics	Use dynamic memory to store information when binding to method and bound object.	Make use of a non-type template parameter. Very easy to understand.	Due to the fact that each compiler stores pointer-to-method in different ways, Don defines a uniform format for his delegate. Then depending on each compiler, he converts from compiler's format to his own format.
Memory footprint	Not efficient due to expensive heap memory allocation.	Efficient	Efficient
Standard compliant	Yes	Yes	No. Actually, it's a hack.
Portable	Yes	No. Not all C++ compilers support non-type template arguments.	Yes, but only for the C++ compilers he knew at the time of writing; future compilers are not guaranteed to work.
Syntax	Nice	Not nice, for example: `SomeObject obj;` `delegate d = delegate::from_member<SomeObject, &SomeObject::someMethod>(&obj);`	Nice
Is delegate comparable?	Yes	No. Actually there is a way to add this feature into Sergey's implementation.	Yes
KISS?	The answer is YES if you feel comfortable in C++ template programming. Otherwise, the answer may be NO.	Yes. It is understandable even if readers are just beginners in template programming.	No. You are required to have a deep knowledge of pointers and compilers to be able to understand his code. Don has provided a comprehensive tutorial in his article.

Note: Some people did a comparison of the speeds of invocation between these listed Delegates. However, in my opinion, the difference of several hundreds of milliseconds for 10,000,000 Delegate invocations is not very significant.

An ideal delegate should be standard-compliant and portable (Boost), have efficient memory usage (Sergey & Don), nice syntax (Boost & Don), comparable (Boost & Don) and KISS (Sergey). Could we improve one of them to be the ideal one, or is there a new one? Yes, I'm going to show you a new one.

Background

Different kinds of class

C++

// kind_of_class.cpp
// Demonstrates the different kinds of pointer-to-member
#include <stdio.h>   // printf
#include <stddef.h>  // size_t

class dummy_base1 { };
class dummy_base2 { };

class dummy_s : dummy_base1 { };
// Reach to here, the compiler will recognize dummy_s is a 
// kind of "single inheritance".
typedef void (dummy_s::*pointer_to_dummy_s)(void);
size_t size_of_single = sizeof(pointer_to_dummy_s);

class dummy_m : dummy_base1, dummy_base2 { };
// Reach to here, the compiler will recognize dummy_m is a 
// kind of "multiple inheritance".
typedef void (dummy_m::*pointer_to_dummy_m)(void);
size_t size_of_multi = sizeof(pointer_to_dummy_m);

class dummy_v : virtual dummy_base1 { };
// Reach to here, the compiler will recognize dummy_v is a 
// kind of "virtual inheritance".
typedef void (dummy_v::*pointer_to_dummy_v)(void);
size_t size_of_virtual = sizeof(pointer_to_dummy_v);

class dummy_u;
// forward reference, unknown at this time
typedef void (dummy_u::*pointer_to_dummy_u)(void);
size_t size_of_unknown = sizeof(pointer_to_dummy_u);

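int main()
{
    // size_t values are cast so that they match the %u format
    printf("%u\n%u\n%u\n%u", (unsigned)size_of_single, (unsigned)size_of_multi, 
        (unsigned)size_of_virtual, (unsigned)size_of_unknown);
    return 0;
}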

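If you compile and run the demo with 32-bit VC++ using default project settings, it prints four sizes, one per line. They are typically 4, 8, 12 and 16: the pointer grows as the inheritance becomes more complex.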

The output of the above example may differ with other C++ compilers. Refer to the section entitled "Implementations of Member Function Pointers" in Don's article for more information. With most compilers, the pointer-to-method of an "unknown" class is the largest of the four. From now on in this article, we divide C++ classes into 2 categories:

Known class (single, multiple, virtual)
Unknown class (forward declaration)

Different kinds of methods of the same class

The following are some attributes (modifiers) that affect the method of a class:

Calling convention: __cdecl, __stdcall, __fastcall, __thiscall. Please refer to "Argument Passing and Naming Conventions" for more information. As far as I know, G++ allows but ignores calling conventions on method declarations.
Number and type of arguments
Return type
Const vs. non-const methods
Virtual vs. non-virtual methods

Fortunately, none of these attributes (modifiers) affects the size of the pointer-to-method. Assume that CKnownClass is a known class which is already described somewhere; here is an example:

C++

// __thiscall, non-const, 1 argument
typedef void (CKnownClass::*method_type_1)(int);

// __cdecl, non-const, 2 arguments, returns CString
typedef CString (__cdecl CKnownClass::*method_type_2)(int, CString);

// __fastcall, const, 3 arguments, returns void*
typedef void* (__fastcall CKnownClass::*method_type_3)(
    int, CString, void*) const;

int main()
{
    // sizeof yields size_t; cast so the value matches the %u format
    printf("%u\n%u\n%u", (unsigned)sizeof(method_type_1),
        (unsigned)sizeof(method_type_2),
        (unsigned)sizeof(method_type_3));
    return 0;
}

Different kinds of free functions or static methods

The following are some attributes (modifiers) that affect a free function or static method:

Calling convention: __cdecl, __stdcall, __fastcall
Number and type of arguments
Return type

Similarly, these attributes have no effect on the size of a pointer to a free function. The size is typically 4 bytes on 32-bit platforms and 8 bytes on 64-bit platforms.
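The claim about free-function pointers can be checked with a few lines of portable C++. This is only a sketch: the three signatures are arbitrary examples chosen for illustration, and calling conventions are left out because they are compiler-specific.

```cpp
#include <cassert>
#include <cstddef>

// Three pointer-to-function types with different argument lists and
// return types (arbitrary examples, not taken from the library).
typedef void   (*no_args_fn)();
typedef int    (*one_arg_fn)(int);
typedef double (*three_args_fn)(int, long, const char*);

// Returns true when all three pointer types have the same size.
bool function_pointer_sizes_match()
{
    const std::size_t s0 = sizeof(no_args_fn);
    const std::size_t s1 = sizeof(one_arg_fn);
    const std::size_t s2 = sizeof(three_args_fn);
    return s0 == s1 && s1 == s2;
}

void example_usage()
{
    // Expected to hold on mainstream platforms; the C++ standard itself
    // does not promise it.
    assert(function_pointer_sizes_match());
}
```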

Under the hood

This section shows how this library is implemented, step by step.

Data structure of delegate

The data structure of Delegate includes the following information:

1. Number and type of arguments
2. Return type
3. Method or free function
4. Calling convention
5. Address of method or function that the delegate points to

Note: We don't need to store information about "const" and "virtual" attributes.

To answer points 1 and 2, this Delegate will be implemented as a C++ template, just like in other implementations. For example:

C++

// Boost
typedef boost::function2<void, int, char*> BoostDelegate;

// Don Clugston
typedef fastdelegate::FastDelegate2<int, char*, void> DonDelegate;

// Sergey Ryazanov
typedef srutil::delegate2<void, int, char*> SRDelegate;

// And this implementation
typedef sophia::delegate2<void, int, char*> SophiaDelegate;

The following code snippet addresses points 3, 4 and 5:

C++

class delegate2 // <void (int, char*)>
{
protected:
    class _never_exist_class;

    typedef void (_never_exist_class::*thiscall_method)(int, char*);
    typedef void (__cdecl _never_exist_class::*cdecl_method)(int, char*);
    typedef void (__stdcall _never_exist_class::*stdcall_method)(int, char*);
    typedef void (__fastcall _never_exist_class::*fastcall_method)(int, char*);
    typedef void (__cdecl *cdecl_function)(int, char*);
    typedef void (__stdcall *stdcall_function)(int, char*);
    typedef void (__fastcall *fastcall_function)(int, char*);

    enum delegate_type
    {
        thiscall_method_type,
        cdecl_method_type,
        stdcall_method_type,
        fastcall_method_type,
        cdecl_function_type,
        stdcall_function_type,
        fastcall_function_type,
    };

    class greatest_pointer_type
    {
        char never_use[sizeof(thiscall_method)];
    };

    delegate_type m_type;
    _never_exist_class* m_p;
    greatest_pointer_type m_fn;

public:
    void operator()(int i, char* s)
    {
        switch(m_type)
        {
        case thiscall_method_type:
            return (m_p->*(*(thiscall_method*)(&m_fn)))(i, s);
        case cdecl_function_type:
            return (*(*(cdecl_function*)(&m_fn)))(i, s);
        default:
            // This is just a demo: the remaining cases are not implemented.
            // Note that a bare 'throw' outside a handler calls std::terminate().
            throw;
        }
    }

    static int compare(const delegate2& _left, const delegate2& _right)
    {
        // first, compare pointer
        int result = memcmp(&_left.m_fn, &_right.m_fn, sizeof(_left.m_fn));
        if(0 == result)
        {
            // second, compare object
            result = ((char*)_left.m_p) - ((char*)_right.m_p);
        }
        return result;
    }

    // constructor from __cdecl function
    delegate2(void (__cdecl *fn)(int, char*))
    {
        m_type = cdecl_function_type;
        m_p = 0;
        reinterpret_cast<cdecl_function&>(m_fn) = fn;
        // fill redundant bytes by ZERO for later comparison
        memset((char*)(&m_fn) + sizeof(fn), 0, sizeof(m_fn) - sizeof(fn));
    }

    // constructor from __thiscall method
    template<class T> delegate2(T* p, void (T::*fn)(int, char*))
    {
        m_type = thiscall_method_type;
        m_p = reinterpret_cast<_never_exist_class*>(p);

        ///////////////////////////////////////////////////////////
        // WE WANT TO DO THE FOLLOWING ASSIGNMENT
        // m_fn = fn
        // BUT HOW TO DO IT IN A STANDARD COMPLIANT AND PORTABLE WAY?
        // THE ANSWER FOLLOWS
        ///////////////////////////////////////////////////////////

        // forward reference
        class _another_never_exist_class_;
        typedef void (
            _another_never_exist_class_::*large_pointer_to_method)(
            int, char*);
            
        COMPILE_TIME_ASSERT(sizeof(
            large_pointer_to_method)==sizeof(greatest_pointer_type ));

        // Now tell compiler that '_another_never_exist_class_' 
        // is just a 'T' class
        class _another_never_exist_class_ : public T {};
        
        reinterpret_cast<large_pointer_to_method&>(m_fn) = fn;

        // Double checking to make sure the compiler doesn't change its 
        // mind :-)
        COMPILE_TIME_ASSERT(
            sizeof(large_pointer_to_method)==sizeof(greatest_pointer_type ));
    }
};

As you have just seen, we force the compiler to convert the pointer-to-method of a known class to a pointer-to-method of an unknown class. In other words, we convert a pointer-to-method from its smallest format to its largest format. In this way, we have a unified format for all kinds of pointers to functions/methods. As a result, comparison between Delegate instances is easy. It's simply a call to the standard memcmp C function.
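The comparison idea can be shown in isolation. The following is a minimal sketch, not the library's code: `pointer_slot`, `make_slot` and `same_target` are hypothetical names, only free functions are handled, and the pointer is copied with memcpy rather than through a cast. The unused tail of the slot is zeroed so that two slots holding the same pointer are byte-for-byte identical.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical fixed-size slot, larger than a free-function pointer
// on mainstream platforms.
struct pointer_slot
{
    unsigned char bytes[4 * sizeof(void*)];
};

typedef void (*handler)(int);

// Zero the whole slot first, then copy the pointer into its front.
pointer_slot make_slot(handler fn)
{
    pointer_slot slot;
    std::memset(slot.bytes, 0, sizeof(slot.bytes));
    std::memcpy(slot.bytes, &fn, sizeof(fn));
    return slot;
}

// Two slots refer to the same target when all their bytes match.
bool same_target(const pointer_slot& a, const pointer_slot& b)
{
    return std::memcmp(a.bytes, b.bytes, sizeof(a.bytes)) == 0;
}

// Two distinct example targets.
int g_counter = 0;
void on_open(int n)  { g_counter += n; }
void on_close(int n) { g_counter -= n; }

void example_usage()
{
    assert(same_target(make_slot(&on_open), make_slot(&on_open)));
    assert(!same_target(make_slot(&on_open), make_slot(&on_close)));
}
```

Without the zero fill, the padding bytes would hold indeterminate values and memcmp could report a difference between two slots that point to the same function.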

Make Delegate faster and more extensible

There are 2 problems with the above design:

First, the "switch... case" statement makes this implementation run a little slower than others.
Second, if we want to extend the delegate to have more features -- i.e. support for reference counter mechanisms like smart pointer or COM interface -- we need more storage for that information.

Polymorphism might be an answer. However, the "one-size-fits-all" characteristic of Delegate is the main reason for its existence. Due to that fact, all methods or operators of Delegate class templates must be made as NON-virtual. At this point, many will remember the so-called "Strategy Design Pattern." Yes, it is also my choice. However, there are still things that need to be considered:

Using the "Strategy Design Pattern" introduces a little overhead when invoking Delegate: the user application passes parameters to the delegate; the delegate passes parameters to its strategy and the strategy again passes parameters to the real method or function. However, if arguments are all simple types such as char, long, int, pointer, and reference then the compiler will automatically generate optimizing code that removes such overhead.
Who should hold data: Strategy or Delegate? Data here means the pointer-to-object (_never_exist_class* m_p) and pointer-to-address of the method or function (greatest_pointer_type m_fn). If Delegate holds data, it must pass data to Strategy. Such operations prevent the compiler from optimizing the code. If Strategy holds data, the Strategy object must be created dynamically. This involves expensive memory allocations (new, delete operation).

The two problems are resolved if we apply the Strategy Design Pattern with a little modification:

To allow the compiler to optimize the code, we put data into Strategy. Note: Putting data into Strategy causes it to look like the Bridge Design pattern, but this is not important.
To avoid dynamic memory allocation, we will embed the whole Strategy object into the Delegate object instead of keeping a pointer to it as usual.

How are Strategies implemented?

The real implementation makes use of templates. To make it easier to follow, the following code is written without templates, for reference purposes only:

C++

class delegate_strategy // <void (int, char*)>
{
protected:
    class _never_exist_class;

    typedef void (_never_exist_class::*thiscall_method)(int, char*);
    typedef void (__cdecl _never_exist_class::*cdecl_method)(int, char*);
    typedef void (__stdcall _never_exist_class::*stdcall_method)(int, char*);
    typedef void (__fastcall _never_exist_class::*fastcall_method)(int, char*);
    typedef void (__cdecl *cdecl_function)(int, char*);
    typedef void (__stdcall *stdcall_function)(int, char*);
    typedef void (__fastcall *fastcall_function)(int, char*);

    class greatest_pointer_type
    {
        char never_use[sizeof(thiscall_method)];
    };

    _never_exist_class* m_p;
    greatest_pointer_type m_fn;

public:

    // virtual call operator, overridden by the concrete strategies
    // (not declared pure: this default version simply throws)
    virtual void operator()(int, char*) const
    {
        throw exception();
    }
};

class delegate_cdecl_function_strategy : public delegate_strategy
{
    // concrete strategy
    virtual void operator()(int i, char* s) const
    {
        return (*(*(cdecl_function*)(&m_fn)))(i, s);
    }

public:

    // constructor
    delegate_cdecl_function_strategy(void (__cdecl *fn)(int, char*))
    {
        m_p = 0;
        reinterpret_cast<cdecl_function&>(m_fn) = fn;
        // fill redundant bytes by ZERO for later comparison
        memset((char*)(&m_fn) + sizeof(fn), 0, sizeof(m_fn) - sizeof(fn));
    }
};

class delegate_thiscall_method_strategy : public delegate_strategy
{
    // concrete strategy
    virtual void operator()(int i, char* s) const
    {
        return (m_p->*(*(thiscall_method*)(&m_fn)))(i, s);
    }

public:

    // constructor
    template<class T> delegate_thiscall_method_strategy(
        T* p, void (T::*fn)(int, char*))
    {
        m_p = reinterpret_cast<_never_exist_class*>(p);

        ///////////////////////////////////////////////////////////
        // WE WANT TO DO THE FOLLOWING ASSIGNMENT
        // m_fn = fn
        // BUT HOW TO DO IT IN A STANDARD COMPLIANT AND PORTABLE WAY?
        // THE ANSWER FOLLOWS
        ///////////////////////////////////////////////////////////

        // forward reference
        class _another_never_exist_class_;
        typedef void (
           _another_never_exist_class_::*large_pointer_to_method)(int, char*);
            
        COMPILE_TIME_ASSERT(sizeof(
            large_pointer_to_method)==sizeof(greatest_pointer_type ));

        // Now tell compiler that '_another_never_exist_class_' 
        // is just a 'T' class
        class _another_never_exist_class_ : public T {};
        
        reinterpret_cast<large_pointer_to_method&>(m_fn) = fn;

        // Double checking to make sure the compiler doesn't change its
        // mind :-)
        COMPILE_TIME_ASSERT(sizeof(
            large_pointer_to_method)==sizeof(greatest_pointer_type ));
    }
};

class delegate2 // <void (int, char*)>
{
protected:
    char m_strategy[sizeof(delegate_strategy)];

    const delegate_strategy& strategy() const
    {
        return *reinterpret_cast<const delegate_strategy*>(&m_strategy);
    }

public:
    // constructor for __cdecl function
    delegate2(void (__cdecl *fn)(int, char*))
    {
        new (&m_strategy) delegate_cdecl_function_strategy(fn);
    }

    // constructor
    template<class T>
        delegate2(T* p, void (T::*fn)(int, char*))
    {
        new (&m_strategy) delegate_thiscall_method_strategy(p, fn);
    }

    // Syntax 02: (*delegate)(param...)
    delegate_strategy const& operator*() const throw()
    {
        return strategy();
    }

    // Syntax 01: delegate(param...)
    // Note: syntax 01 might be slower than syntax 02 in some cases
    void operator()(int i, char* s) const
    {
        return strategy()(i, s);
    }
};
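For readers who want something that compiles on any C++11 compiler, here is a reduced sketch of the same embedding idea. It is an illustration under assumptions, not the library's code: the names `strategy`, `function_strategy` and `mini_delegate` are hypothetical, only free functions of type int(int) are supported, and the calling-convention variants are omitted.

```cpp
#include <cassert>
#include <new>

// Base strategy: owns the data and exposes one virtual call operator.
class strategy
{
protected:
    typedef int (*function_type)(int);
    function_type m_fn;
public:
    strategy() : m_fn(0) {}
    virtual ~strategy() {}
    virtual int operator()(int) const { return 0; }  // "null" behaviour
};

// Concrete strategy for free functions; it adds no data members.
class function_strategy : public strategy
{
public:
    explicit function_strategy(function_type fn) { m_fn = fn; }
    virtual int operator()(int x) const { return m_fn(x); }
};

class mini_delegate
{
    // Raw, suitably aligned storage that holds one strategy object.
    alignas(strategy) unsigned char m_storage[sizeof(strategy)];

    const strategy& get() const
    {
        return *reinterpret_cast<const strategy*>(m_storage);
    }
public:
    mini_delegate() { new (m_storage) strategy(); }

    explicit mini_delegate(int (*fn)(int))
    {
        static_assert(sizeof(function_strategy) == sizeof(strategy),
                      "concrete strategies must fit in the embedded storage");
        new (m_storage) function_strategy(fn);  // placement new, no heap
    }

    ~mini_delegate() { get().~strategy(); }

    int operator()(int x) const { return get()(x); }   // delegate(x)
    const strategy& operator*() const { return get(); } // (*delegate)(x)
};

int twice(int x) { return 2 * x; }

void example_usage()
{
    mini_delegate d(&twice);
    assert(d(21) == 42);
    assert((*d)(4) == 8);
}
```

The static_assert documents the key constraint of the design: because the storage is sized for the base class, concrete strategies may override behaviour but must not add data.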

Support object life-time management

When binding an object and its method to a delegate instance, the delegate normally just keeps the address of the object and the address of the method for later invocation. There are 2 possible problems that may cause our application to crash during runtime:

What happens if the method is inside a DLL but that DLL is already unloaded out of process' space? We have no way to deal with such situations so we simply ignore this problem.
What happens if the object is deleted somehow, somewhere due to a developer's mistake? The simple answer is: Developers must take care by themselves to avoid this mistake. However, manual object management is always tedious and error-prone, and it slows developers down. So I have tried to find a simple but good enough mechanism for this purpose. It is described below:

Boost introduces the Clonable & Clone Allocator concepts. Although it's not flexible for many purposes, its simplicity will not make this Delegate library complicated. For that reason, this library makes use of the concepts in Boost and exposes the following handy Clone Allocator classes.

Class view_clone_allocator (http://www.boost.org/libs/ptr_container/doc/reference.html#class-view-clone-allocator) is an allocator that does nothing. It's identical to the one with the same name in Boost. When creating a delegate instance, if we don't specify an allocator, this one will be used by default.
Class heap_clone_allocator (http://www.boost.org/libs/ptr_container/doc/reference.html#class-heap-clone-allocator) is also identical to the one with the same name in Boost. It uses dynamic memory allocation and the copy constructor to clone the bound object.
Class com_autoref_clone_allocator is provided to support COM interfaces. It should also work for objects of any class that implements the two methods AddRef and Release with their usual meaning.

One rule should be remembered when assigning one Delegate instance to another: the target Delegate instance uses the clone allocator of the source to clone the object. The logic of assignment is as follows:

At first, the target delegate (left side) frees its object using its current clone allocator.
All information of the source (right side) would be copied to the target including clone allocator. Actually, this is a simple bit-wise copy.
The target would clone the new object that it's holding using the new clone allocator.
And so on for later assignments between delegate instances.

Note: Actually, in real implementation I already considered and eliminated the problem of self-assignment, in which source and target are identical.
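The assignment steps above can be sketched as follows. This is a simplified illustration, not the library's implementation: `widget`, `view_alloc`, `heap_alloc` and `holder` are hypothetical names, the holder stores only an object pointer (no method pointer), and the allocators are modelled as a pair of static functions.

```cpp
#include <cassert>

struct widget { int value; };  // hypothetical bound object type

// Does nothing: the holder merely views the object.
struct view_alloc
{
    static void* clone(void* p) { return p; }
    static void release(void*) {}
};

// Clones with new and the copy constructor, releases with delete.
struct heap_alloc
{
    static void* clone(void* p)
    {
        return p ? new widget(*static_cast<widget*>(p)) : 0;
    }
    static void release(void* p) { delete static_cast<widget*>(p); }
};

class holder
{
    void* m_obj;
    void* (*m_clone)(void*);
    void  (*m_release)(void*);
public:
    holder()
        : m_obj(0), m_clone(&view_alloc::clone), m_release(&view_alloc::release) {}

    holder(const holder& src)
        : m_obj(src.m_clone(src.m_obj)), m_clone(src.m_clone), m_release(src.m_release) {}

    ~holder() { m_release(m_obj); }

    template<class Alloc>
    void bind(widget* p, Alloc)
    {
        m_release(m_obj);
        m_clone = &Alloc::clone;
        m_release = &Alloc::release;
        m_obj = m_clone(p);
    }

    holder& operator=(const holder& src)
    {
        if (this == &src) return *this;  // self-assignment is a no-op
        m_release(m_obj);                // 1. free with the current allocator
        m_obj = src.m_obj;               // 2. copy everything from the source,
        m_clone = src.m_clone;           //    including its allocator
        m_release = src.m_release;
        m_obj = m_clone(m_obj);          // 3. clone with the new allocator
        return *this;
    }

    widget* get() const { return static_cast<widget*>(m_obj); }
};

void example_usage()
{
    widget w = { 7 };
    holder a;
    a.bind(&w, heap_alloc());  // a owns a heap copy of w
    holder b;
    b = a;                     // b now owns its own heap copy
    assert(b.get() != a.get());
    assert(b.get()->value == 7);
}
```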

In some cases, we want to bind an already-cloned object to a delegate instance. If so, we want the delegate to free the object automatically, but not clone it again. In order to achieve this purpose, when binding an object and its method to a delegate, we have to provide 2 additional pieces of information: the clone allocator class is the first one; the second is a Boolean value to tell whether the delegate should clone the object or not.

C++

delegate.bind(
    &object, &TheClass::a_method, 
    clone_option< heap_clone_allocator >(true));

Relaxed Delegates

With non-relaxed delegate libraries, template parameter types passed over to the delegate are very strictly checked. For example, if we assign a function with prototype int (*)(long) to a delegate with prototype long (*)(int), the compiler will raise errors saying that the assignment is not allowed since int and long are of different types. Actually, such conversion is safe because it meets the following three conditions:

The number of arguments matches.
Each matching argument can be implicitly converted from the delegate's argument to the target function's argument by the compiler.
The return-type can be implicitly converted from the target function's return-type to the delegate's return-type by the compiler. The void return-type is a special case: we can bind a delegate that returns void to any method or function that satisfies the two above conditions. This is the same as when we call a function/method, but don't care about its returned value.
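The three conditions can be illustrated with a small type-erasing wrapper. This is a sketch under assumptions rather than the library's mechanism: `relaxed_delegate1` and `half_of` are hypothetical names, only free functions with one argument are handled, and the void return-type special case is left out.

```cpp
#include <cassert>

// A one-argument delegate with signature R(A) that accepts any free
// function TR(TA), provided A converts to TA and TR converts to R.
template<class R, class A>
class relaxed_delegate1
{
    typedef void (*erased_fn)();
    typedef R (*thunk_type)(erased_fn, A);

    erased_fn  m_fn;     // target, stored with its type erased
    thunk_type m_thunk;  // knows the real type and performs the call

    template<class TR, class TA>
    static R invoke(erased_fn fn, A a)
    {
        // The compiler applies the implicit conversions here:
        // A -> TA for the argument and TR -> R for the result.
        return reinterpret_cast<TR (*)(TA)>(fn)(a);
    }

public:
    template<class TR, class TA>
    explicit relaxed_delegate1(TR (*fn)(TA))
        : m_fn(reinterpret_cast<erased_fn>(fn)), m_thunk(&invoke<TR, TA>)
    {
    }

    R operator()(A a) const { return m_thunk(m_fn, a); }
};

// Target with prototype int(long), bound below to a long(int) delegate.
int half_of(long x) { return static_cast<int>(x / 2); }

void example_usage()
{
    relaxed_delegate1<long, int> d(&half_of);
    assert(d(10) == 5L);
}
```

If one of the conversions is not available, the call inside `invoke` fails to compile, so an unsafe binding is rejected at compile time.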

Using the code

The following code snippet demonstrates the usage of this library:

C++

using namespace sophia;

// Base class with a virtual method
struct BaseClass
{
   virtual int virtual_method(int param) const
   {
       printf("We are in BaseClass: (param = %d)\n", param);
       return param;
   }

   char relaxed_method(long param)
   {
       printf("We are in relaxed_method: (param = %ld)\n", param);
       return 0;
   }
};

// A virtual-inheritance class
struct DerivedClass : public virtual BaseClass {
   virtual int virtual_method(int param) const
   {
       printf("We are in DerivedClass: (param = %d)\n", param);
       return param;
   }
};

void Test()
{
   // Assuming we have some objects
   DerivedClass object;

   // Delegate declaration
   typedef sophia::delegate0<DWORD> MyDelegate0;
   typedef sophia::delegate1<int, int> MyDelegate1;
   typedef sophia::delegate4<void, int, int, long, char> AnotherDelegateType;

   // Determine size of a delegate instance
   printf("sizeof(delegate) = %u\n", (unsigned)sizeof(AnotherDelegateType));

   // Constructor
   MyDelegate0 d0(&GetCurrentThreadId);
   MyDelegate1 d1(&object, &DerivedClass::virtual_method);
   MyDelegate1 d2; // null delegate
   AnotherDelegateType dNull;

   // Compare between delegates even if they are different types
   assert(d2 == dNull);

   // Bind to a free function or a method
   d0.bind(&GetCurrentThreadId);
   d0 = &GetCurrentThreadId;
   d2.bind(&object, &DerivedClass::virtual_method);

   // Compare again after binding
   assert(d2 == d1);

   // Clear a delegate
   d2 = NULL; // or
   d2.clear();

   // Invoke with syntax 01
   d1(1000);

   // Invoke with syntax 02
   // This syntax is faster than syntax 01
   (*d1)(10000);

   // RELAXED delegate
   d1.bind(&object, &DerivedClass::relaxed_method);
   (*d1)(10000);

   // Swap between two delegates
   d2.swap(d1); // now d1 == NULL

   // Execute a null/empty delegate
   assert(d1.empty());
   try
   {
       d1(100);
   }
   catch(sophia::bad_function_call& e)
   {
       printf("Exception: %s\n    Try again: ", e.what());
       d2(0);
   }

   // Object life-time management
   // Case 1: we want the object to be cloned
   d1.bind(&object, &DerivedClass::virtual_method,
       clone_option<heap_clone_allocator>(true));

   // Object life-time management
   // Case 2: we DO NOT want the object to be cloned when binding
   for(int i=0; i<100; ++i)
   {
       DerivedClass* pObject = new DerivedClass();
       d1.bind(pObject, &DerivedClass::virtual_method,
           clone_option<heap_clone_allocator>(false));
       d1(100);
   }
}

A comparison of performance

All Delegates, including this one, add 2 additional layers before invoking a real method or free function.

First, parameters are passed to Delegate.
Second, parameters are passed to Stub (Sergey), Invoker (Boost, Don) or Strategy (this library).
Third, parameters are passed to a real method or function.

This Delegate exposes a way to reduce one layer:

Syntax 01 uses 2 additional layers, the same as other ones
D(param1, param2)
Syntax 02 uses a de-reference operator to call the Strategy object directly
(*D)(param1, param2)

If we pass simple data types (char, int, pointer…) to any of the listed Delegate implementations, the compiler will generate optimized code, so the speeds of invocation do not differ much between them. However, if we pass complex data types that involve a constructor and destructor, the 2nd syntax is faster than any other.
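The cost of each forwarding layer for a by-value argument can be made visible by counting copy-constructor calls. The sketch below uses plain functions as stand-ins for the layers; `tracked`, `delegate_layer` and `strategy_layer` are hypothetical names and no delegate library is involved.

```cpp
#include <cassert>

// A type that counts how many times it is copied.
struct tracked
{
    static int copies;
    tracked() {}
    tracked(const tracked&) { ++copies; }
};
int tracked::copies = 0;

// Each layer takes its argument by value and passes it on.
void target(tracked) {}
void strategy_layer(tracked t) { target(t); }
void delegate_layer(tracked t) { strategy_layer(t); }

// Syntax 01: the call goes through the delegate, then the strategy.
int copies_via_delegate()
{
    tracked::copies = 0;
    tracked t;
    delegate_layer(t);
    return tracked::copies;
}

// Syntax 02: the call goes to the strategy directly.
int copies_via_strategy()
{
    tracked::copies = 0;
    tracked t;
    strategy_layer(t);
    return tracked::copies;
}

void example_usage()
{
    assert(copies_via_delegate() == 3);  // one copy per by-value hop
    assert(copies_via_strategy() == 2);  // one layer, one copy fewer
}
```

For types whose copy constructor is expensive, that one saved copy per call is where the second syntax gains its advantage.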

Points of interest

This article shows a way of converting various formats of pointer-to-method/free-function into a uniform format; the conversion is achieved in a standard-compliant way. As a result, comparison between delegate instances is easy.
The Strategy Design Pattern with a little modification makes this Delegate fast and extensible.
It is a kind of KISS method: people with a limited background in C++ template programming can understand the source code.

What's next?

Multicast Delegate

What we discussed in this article is about the Singlecast Delegate. There is another kind, the so-called Multicast Delegate, which is internally just a collection of other delegates. When we make an invocation on Multicast, all Singlecast delegates in the collection are also invoked one-by-one. Mainly, Multicast is used to implement the Observer Design Pattern.
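To make the idea concrete, here is a minimal sketch of a multicast wrapper. It is an assumption-laden illustration and not part of this library: `multicast1` is a hypothetical name, and `std::function` from C++11 stands in for the singlecast delegate.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// A collection of singlecast targets that are invoked one by one.
class multicast1
{
    std::vector<std::function<void(int)> > m_targets;
public:
    void add(const std::function<void(int)>& fn) { m_targets.push_back(fn); }

    std::size_t size() const { return m_targets.size(); }

    // Invoke every registered target, in registration order.
    void operator()(int x) const
    {
        for (std::size_t i = 0; i < m_targets.size(); ++i)
            m_targets[i](x);
    }
};

// Two example observers that accumulate into a global.
int g_sum = 0;
void add_once(int x)  { g_sum += x; }
void add_twice(int x) { g_sum += 2 * x; }

void example_usage()
{
    g_sum = 0;
    multicast1 m;
    m.add(&add_once);
    m.add(&add_twice);
    m(5);                 // calls add_once(5), then add_twice(5)
    assert(g_sum == 15);
}
```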

More portable

Here is the list of compilers that the library has been tested with:

VC++ 7.1
VC++ 2005
Mingw32-gcc-3.4.5

Because not all C++ compilers comply with the C++ standard, please let me know if the library does not work with your compiler.

History

May 23, 2007: Supports relaxed delegates.
May 19, 2007: Initial version.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)