Changing an Object's Polymorphic Behavior at Runtime with the V-table

danielh_code

1 Feb 2010 | CPOL | 8 min read

This technique allows you to change the polymorphic behavior of an object at runtime using the v-table.

Introduction

I recently discovered an interesting C++ technique that I've never read about before, so I thought I'd share it here. It isn't a language feature or anything, but it is still interesting and (in my case at least) useful. The technique allows you to change the polymorphic behavior of an object at runtime.

Background

First, a little back story. I've got a Property class that provides generic access to an object's property value. To provide this, the Property class must know the data type of the property that it encapsulates. So, I've also got a DataType class that encapsulates a data type and provides generic access to values of that type. This DataType class uses standard polymorphic class design: the abstract base DataType class is implemented for each data type that we need to support (e.g., DataType_int or DataType_MyClass). So, my Property class has a reference (pointer) to a DataType object which provides it with generic access to that type's value. This is also an example of the Strategy pattern, which allows the Property class to change its behavior (its DataType) at runtime, and an example of design by composition (Property has a DataType) rather than inheritance (Property is subclassed for each DataType it must support). So far, I think that I'm on the right path.

The problem arises when I make a couple of DataType subclasses and begin trying to assign them to Property. Since Property has a reference to a DataType object, that object must exist somewhere. So, I have a couple of options. I can have Singleton instances of each DataType subclass and let Property objects reference those Singletons. Or, I can dynamically allocate an instance of a DataType class and let the Property class manage that object's memory. The latter would result in many small allocations, which would be slow and could fragment the heap. So it isn't desirable. And, I prefer not to keep globals around if at all possible, so the Singleton solution, while not terrible, was not ideal.

I started thinking of using a structure of function pointers to hold the many behaviors required to support a given type. However, I quickly realized that this would result in huge objects, when all I really want is a single reference to the group of functions that defines a type's behavior. At this point, I realized (as I'm sure you also have) that what I needed was a class. A class provides each of its instances with a group of functions accessed via a single reference, the v-table pointer. Following this train of thought, I began to think of an object as a reference to a group of functions (methods). If I just copied this reference, then I could change the functionality of my object (exactly the way that my Property class can change its functionality by changing its DataType reference). This is the standard Strategy design pattern.

Using the Code

The solution that I arrived at looks like this (I'll explain below):

C++

#include <cstring> // for memcpy

// Base DataType class
class DataType {
public:

    // Construction
    DataType() {}
    DataType(const DataType &newType) { setType(newType); }
 
    // Set the polymorphic behavior of this DataType object
    void setType(const DataType &newType) {
        memcpy(this, &newType, sizeof(DataType));
    }
 
    // Polymorphic behavior example
    protected: virtual int _getSizeOfType() const { return -1; }
    public: inline int getSizeOfType() const { return _getSizeOfType(); }
 
    // Polymorphic behavior example
    protected: virtual const char *_getTypeName() const { return NULL; }
    public: inline const char *getTypeName() const { return _getTypeName(); }
};
 
// Implementation of DataType for 'int'
class DataType_int : public DataType {
public:

    // Construction
    DataType_int() {}
    DataType_int(const DataType &newType) : DataType(newType) {}
 
    // Polymorphic behavior example
    protected:  virtual int _getSizeOfType() const { return sizeof(int); }
 
    // Polymorphic behavior example
    protected: virtual const char *_getTypeName() const { return "int"; }
};
 
// Implementation of DataType for 'float'
class DataType_float : public DataType {
public:

    // Construction
    DataType_float() {}
    DataType_float(const DataType &newType) : DataType(newType) {}
 
    // Polymorphic behavior example
    protected:  virtual int _getSizeOfType() const { return sizeof(float); }
 
    // Polymorphic behavior example
    protected: virtual const char *_getTypeName() const { return "float"; }
};
 
// Example
DataType myType = DataType_int();
const char *typeName = myType.getTypeName(); // returns "int"
int typeSize = myType.getSizeOfType(); // returns sizeof(int)
 
myType.setType(DataType_float());
typeName = myType.getTypeName(); // returns "float"

As you can see, when we set the type, we are simply using memcpy to make the object's v-table pointer point to the v-table of the object that gets passed in. This changes myType's polymorphic behavior to that of the new type! And, we no longer need pointers or singletons or dynamic memory allocations! We have an object that is the size of a v-table pointer, and that is all! If you prefer a bit of a speedup here, you could just use *((void**)this) = *((void**)&newType); to copy the pointer directly, assuming that your DataType class has no members (thanks to Dezhi Zhao for pointing that out in his comments below).

Please keep in mind that this technique is not standards compliant, as the standard doesn't say anything about v-tables or v-ptrs (thank you to all of the commenters below who pointed this out). In addition, a class with virtual methods is not trivially copyable, so copying it with memcpy is undefined behavior as far as the standard is concerned. If a compiler implements virtual methods in a way that doesn't store lookup information within an object's memory space, this technique will fail completely. However, I have never heard of a C++ compiler that doesn't work this way.
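
To make the compliance caveat concrete, the following sketch asks the compiler whether memcpy is sanctioned for a given type. It assumes a C++11 or later compiler (for static_assert and the type_traits header), which is newer than the code in this article; PolymorphicExample and PlainExample are hypothetical stand-ins, not classes from the article.

```cpp
#include <type_traits>

// Hypothetical stand-in for DataType: any class with a virtual method.
class PolymorphicExample {
public:
    virtual ~PolymorphicExample() {}
    virtual int getSizeOfType() const { return -1; }
};

// A plain struct, for contrast.
struct PlainExample {
    int value;
};

// The standard only defines memcpy for trivially copyable types.
static_assert(!std::is_trivially_copyable<PolymorphicExample>::value,
              "a class with virtual methods is not trivially copyable");
static_assert(std::is_trivially_copyable<PlainExample>::value,
              "a plain struct is trivially copyable");
```

Both assertions hold, which is another way of saying that the memcpy in setType works only because of how common compilers lay out objects, not because the language promises it.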

Also, you can see that we can easily change the type of myType at any point during runtime. This gives you the flexibility of having an uninitialized array of DataType objects and initializing them whenever you like later. For the performance minded out there, Dezhi Zhao also pointed out below that this will most likely cause the processor's branch prediction to fail for the getTypeName() call immediately after changing the type. In the example above, this only affects the call made after switching to DataType_float, since a prediction can only fail if the processor has already made one.

One thing that you may have noticed is the use of public proxy methods (getSizeOfType) that call protected virtual methods (_getSizeOfType). We need to do this because the compiler may skip the v-table lookup when it knows the actual type of an object (as opposed to pointers or references, where it usually doesn't). This is perfectly reasonable, but breaks our setup. Inside the proxies, though, the call goes through the this pointer, so the v-table lookup happens. (An aggressively optimizing compiler that inlines the proxy may still be able to skip the lookup, so it is worth testing with optimizations enabled.) And because the proxies are inline, all they really do is make the compiler look up the correct method in the v-table and call that one. Remember, however, that we are not removing the virtual method lookup. This setup will not speed up virtual method calls in any way. In fact, we depend on the compiler looking up our virtual method for this to work.

One important thing to note about this setup is the absence of any member variable in DataType. Since we are doing a memcpy expecting that both objects have the same size (sizeof(DataType)), none of DataType's subclasses may add any member variables. You could add member variables to DataType with no problem, but you are not able to add any member variables to subclasses. Since I didn't need any member variables for DataType, this didn't present a problem for me. However, it is not impossible to add member variables to subclasses. You just need to use memory that was provided in the base class as the memory where your members live. For example:

C++

#include <cstring> // for memcpy

// Base DataType class
class DataType {
public:

    // Construction
    DataType() {}
    DataType(const DataType &newType) { setType(newType); }
 
    // Set the polymorphic behavior of this DataType object
    void setType(const DataType &newType) {
        memcpy(this, &newType, sizeof(DataType));
    }
 
    // (The virtual methods from the first listing are omitted here for brevity.)
 
protected:
 
    // Member data
    enum { kMemberDataBufferSize = 256, kMemberDataSize = 0 };
    char memberDataBuffer[kMemberDataBufferSize];
};
 
// My Data Type class
class DataType_MyType : public DataType {
public:

    // My base class
    typedef DataType BASECLASS;
 
    // Construction
    DataType_MyType() {}
    DataType_MyType(const DataType &newType) : DataType(newType) {}
 
    // Access myData
    inline int getExampleMember() const { return _getMemberData().exampleMember; }
    inline void setExampleMember(int newExampleMember) 
        { _getMemberData().exampleMember = newExampleMember; }
 
protected:
 
    // Member Data
    struct SMemberData {
        int exampleMember;
    };
 
    // Amount of member data buffer that we use (this class' member data +
    // all base class' member data)
    enum { kMemberDataSize = sizeof(SMemberData) + BASECLASS::kMemberDataSize };
 
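    // Make sure that we don't run out of data buffer. A negative array size is
    // a compile error, so the typedef only compiles when x is true.
    // (With C++11 or later, static_assert does the same job.)
    #define compileTimeAssertJoin2(a, b) a##b
    #define compileTimeAssertJoin(a, b) compileTimeAssertJoin2(a, b)
    #define compileTimeAssert(x) \
        typedef char compileTimeAssertJoin(assertAtLine, __LINE__)[(x) ? 1 : -1]
    compileTimeAssert(kMemberDataSize <= kMemberDataBufferSize);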
 
    // Access member data
    inline SMemberData &_getMemberData() {
        return *((SMemberData*) memberDataBuffer);
    }
    inline const SMemberData &_getMemberData() const {
        return *((const SMemberData*) memberDataBuffer);
    }
};

As you can see, the DataType base class simply provides a buffer of data which the subclasses may use to store whatever member data they like. While this setup is a bit messy, it clearly works, and without too many hoops to jump through.
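
Since everything depends on subclasses being the same size as the base class, it can help to let the compiler check that. The sketch below is one way to do it, assuming a C++11 or later compiler for static_assert; BaseType and DerivedType are hypothetical stand-ins for DataType and its subclasses.

```cpp
// Hypothetical stand-in for DataType: a v-table pointer plus a data buffer.
class BaseType {
public:
    virtual ~BaseType() {}
    virtual int getSizeOfType() const { return -1; }

protected:
    char memberDataBuffer[256];
};

// Hypothetical stand-in for a DataType subclass.
class DerivedType : public BaseType {
public:
    virtual int getSizeOfType() const { return sizeof(int); }

    // On typical platforms, uncommenting the next line makes the
    // assertion below fail, because the object grows:
    // int extraMember;
};

// Refuse to compile if the subclass is larger than the base class.
static_assert(sizeof(DerivedType) == sizeof(BaseType),
              "subclasses must not add data members");
```

Treat this as a safety net rather than a guarantee: on some ABIs a small member can be placed in unused padding at the end of the base class, in which case the sizes still match even though a member was added.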

Points of Interest

Finally, an excellent complementary technique to use with this one is Type Traits. While implementing my Property class with this new DataType setup, I realized that it was kind of a pain to specify your DataType subclass whenever you register a method or member as a property:

C++

Property propList[] = {
    Property(
        "prop1",
        DataType_Prop1(), &MyClass::getProp1,
        DataType_Prop1(), &MyClass::setProp1
    ),
 
    Property(
        "prop2",
        DataType_Prop2(), &MyClass::getProp2,
        DataType_Prop2(), &MyClass::setProp2
    ),
};

Also, this setup isn't very type-safe: if I change the return type of MyClass::getProp1, I get no warnings or errors, and the program will (at best) crash and burn when I use that property. Ideally, you would declare properties like this:

C++

Property propList[] = {
    Property("prop1", &MyClass::getProp1, &MyClass::setProp1),
    Property("prop2", &MyClass::getProp2, &MyClass::setProp2),
};

The data type would be pulled from the method declaration and converted into the appropriate DataType subclass. Luckily for me, my Property constructor already looked like this:

C++

template <class Class, typename AccessorReturnType, typename MutatorArgType>
Property(
    const char *propertyName,
    const DataType &accessorDataType, AccessorReturnType (Class::*accessor)(),
    const DataType &mutatorDataType, void (Class::*mutator)(MutatorArgType)
) {
    set(propertyName, accessorDataType, accessor, mutatorDataType, mutator);
}

So, I already had the data types that I would need: AccessorReturnType and MutatorArgType. All I needed was some mechanism to convert those compile-time C++ types into run-time DataType subclass objects. This is actually easy to do with a template trick called Template Specialization. I won't describe it in detail here, but if you don't already know what it does, feel free to look it up and come back. It is really powerful.

The basic idea is to have a templated class that is unimplemented, or implemented for the general case. Then, for each special case, we partially or completely specialize our template arguments and implement that as a new class, like this:

C++

// General case is not implemented. 
// If you give this template a type that isn't supported,
// you'll get a compiler error
template <typename CppType> struct MapCppTypeToDataType;
 
// Macro to define a template specialization that maps the given CppType to the given
// DataType. Once mapped, you can access the DataType like so:
//    MapCppTypeToDataType<int>::Type
// This should resolve to whatever type you mapped to int (DataType_int, for example).
#define MAP_DATA_TYPE(CppType, MappedDataType) \
    template <> struct MapCppTypeToDataType<CppType> { \
        typedef MappedDataType Type; \
    }
 
// Function to convert a C++ type to a DataType object
template <typename CppType>
inline DataType GetDataType() {
    // 'typename' is required because Type is a dependent name
    return typename MapCppTypeToDataType<CppType>::Type();
}
 
// Example
MAP_DATA_TYPE(int, DataType_int);
DataType myDataType = GetDataType<int>(); // returns a DataType_int

You can see how powerful this is. Now, we can add a new Property constructor that computes the correct DataType object for you:

C++

template <class Class, typename AccessorReturnType, typename MutatorArgType>
Property(
    const char *propertyName,
    AccessorReturnType (Class::*accessor)(),
    void (Class::*mutator)(MutatorArgType)
) {
    set(
        propertyName,
        GetDataType<AccessorReturnType>(), accessor,
        GetDataType<MutatorArgType>(), mutator
    );
}

This constructor allows you to declare properties as we would prefer to, like this:

C++

Property propList[] = {
    Property("prop1", &MyClass::getProp1, &MyClass::setProp1),
    Property("prop2", &MyClass::getProp2, &MyClass::setProp2),
};

You can see how much easier and safer this constructor is than the old one. You no longer have to know the type of the method you are registering. The compiler, which already knows it, can simply do the work for you. And this method is safer, because if you change the return type of getProp1 now, the compiler will simply change the DataType that gets used. And, if there isn't a DataType that supports the new return type, your compiler will give you an error, something along the lines of "Type was not declared in class 'MapCppTypeToDataType' with template parameters ...".

I hope that you've enjoyed reading about this technique. If you have any comments or questions, I'd love to hear them. Thanks for reading!

P.S.: I'm not sure that the code snippets above compile. They were meant only for illustration, not for compilation. However, if you find any errors, please let me know and I'll correct them.

History

January 24, 2010 - Original article.
February 01, 2010 - Fixed a few code errors.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).