Thunk and its uses

John_Tang

18 Jul 2008, CPOL, 8 min read

An introduction to thunk and its uses in callback handling, interface marshaling, and supporting multiple inheritance in C++.

Download source code - 12.9 KB

Introduction

Thunks are a very useful technique. I will talk about three typical uses of thunks in this article:

Turning callbacks into member functions of classes.
Providing an interface proxy.
Supporting virtual functions under multiple inheritance in C++.

Before we start, let's get a general idea of what a thunk is. A thunk is generally a small piece of machine code that intercepts a client call and modifies the call stack before jumping to the real implementation of that call.

Turning callbacks into member functions of classes

Libraries often require callbacks. The problem with callbacks is that they must be implemented as global or static functions, which can be inconvenient in an OO development environment. For instance, a Win32 program requires us to write a WNDPROC callback function, which usually contains a big switch/case block. Worse, we also need to define static variables inside the WNDPROC to keep track of state between calls. If we could turn callbacks into member functions of classes, we could use member functions instead of a big switch/case block, and member variables instead of function-level static variables to keep track of state.

A thunk can do this magic. The problem with callbacks is that they have no this pointer, so the main job of the thunk is to add a this pointer to the call stack and then call the callback. Once inside the callback, we can fetch the this pointer from the call stack and use it to call member functions. But wait: here the callback is called by the thunk, not by the library, so we still need to give the library an address to call whenever it wants to invoke the callback. That address is the address of our thunk. So, the procedure can be summarized as:

1. The library calls the thunk.
2. The thunk adds a this pointer to the call stack.
3. The thunk forwards the call to the actual callback.
4. The callback fetches the this pointer from the call stack and calls member functions using the this pointer.
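
The four steps can be modeled in portable C++ before looking at the real machine-code version. The following is only a sketch under stated assumptions: Handle, Callback, Window, ThunkModel, and LibraryDispatch are hypothetical names made up for this illustration, and ThunkModel is an ordinary function object rather than a real thunk. A real thunk has to be machine code so that the library can call it through a plain function pointer.

```cpp
#include <cassert>

// Hypothetical stand-ins for HWND and WNDPROC; these are not the Win32 types.
using Handle = void*;
using Callback = int (*)(Handle first_arg, int message);

class Window {
public:
    int OnPaint() { return ++paint_count_; }
    int paint_count() const { return paint_count_; }
private:
    int paint_count_ = 0;  // member state replaces function-level statics
};

// Step 4: the callback finds the this pointer in the argument slot that
// originally held the window handle, and calls a member function with it.
int TurnCallbackIntoMember(Handle first_arg, int /*message*/) {
    Window* self = static_cast<Window*>(first_arg);
    assert(self != nullptr);
    return self->OnPaint();
}

// Model of the thunk: it remembers a this pointer and the target callback.
struct ThunkModel {
    Window* self;
    Callback target;

    // Steps 2 and 3: replace the first argument with the this pointer,
    // then forward the call to the actual callback.
    int operator()(Handle /*hwnd*/, int message) const {
        return target(self, message);
    }
};

// Step 1: the "library" calls whatever it was given and passes a handle.
template <typename Proc>
int LibraryDispatch(Proc proc, Handle hwnd, int message) {
    return proc(hwnd, message);
}
```

Each Window gets its own ThunkModel, so two windows sharing the same callback still end up in their own OnPaint() with their own state.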

This technique is used in the implementation of ATL's CWindowImpl, so I just stole the code from ATL and modified/simplified it. By the way, you don't need any knowledge of ATL to read this sample. But, if you are already familiar with ATL, you can skip this sample.

Using the code: The WindowWithThunk project

The code relating to step 1 is as follows:

C++

LRESULT CALLBACK StartWindowProc(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
    // …
    // Use the address of the thunk as the address of the callback, so that
    // whenever the library calls the callback, it ends up calling the thunk.
    WNDPROC pProc = (WNDPROC)pThis->m_pThunk;
    ::SetWindowLong(hWnd, GWL_WNDPROC, (LONG)pProc);
    // …
}

The code relating to step 2 is as follows:

C++

LRESULT CALLBACK StartWindowProc(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
    CWindowWithThunk* pThis = (CWindowWithThunk*)g_ModuleData.ExtractWindowObj();
    // …
    pThis->m_pThunk->Init((DWORD)TurnCallbackIntoMember, pThis);
    // …
}

void _stdcallthunk::Init(DWORD_PTR proc, void* pThis)
{
    // 0x042444C7 encodes "mov dword ptr [esp+0x4], imm32" on the x86 platform,
    // so together with m_this the following statements form
    // "mov dword ptr [esp+0x4], pThis", where [esp+0x4] is the hWnd argument
    // that Windows pushed onto the call stack. Here the this pointer overwrites
    // the hWnd, but there is no harm because the hWnd has already been saved in
    // the object to which the this pointer refers. See figure 1.

    m_mov = 0x042444C7;  // bytes in memory: C7 44 24 04
    m_this = PtrToUlong(pThis);
    // …
}

Figure 1: The left side shows the original call stack, the right side shows the call stack after "mov dword ptr[esp+0x4], pThis"

The code relating to step 3 is as follows:

C++

LRESULT CALLBACK StartWindowProc(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
    // …
    pThis->m_pThunk->Init((DWORD)TurnCallbackIntoMember, pThis);
    // …
}

void _stdcallthunk::Init(DWORD_PTR proc, void* pThis)
{
    // After the this pointer has been added to the call stack, now jump to the 
    // actual callback (in this case, TurnCallbackIntoMember)
    m_jmp = 0xe9;
    m_relproc = DWORD((INT_PTR)proc - ((INT_PTR)this+sizeof(_stdcallthunk)));
}
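
To make the byte layout of steps 2 and 3 concrete, here is a minimal sketch that builds the 13 thunk bytes in an ordinary array without executing them. It assumes 32-bit x86 addresses passed in as plain integers. EncodeStdcallThunk, JumpTarget, and the helper functions are hypothetical names for this illustration and are not part of ATL or the sample project.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// 13 bytes: C7 44 24 04 <this:4>  E9 <rel32:4>
using ThunkBytes = std::array<std::uint8_t, 13>;

// Store a 32-bit value in little-endian order, as x86 expects.
void put_le32(ThunkBytes& code, std::size_t pos, std::uint32_t value) {
    for (std::size_t i = 0; i < 4; ++i)
        code[pos + i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Read a 32-bit little-endian value back out of the byte array.
std::uint32_t get_le32(const ThunkBytes& code, std::size_t pos) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(code[pos + i]) << (8 * i);
    return value;
}

// Where the jmp lands: address of the byte after the thunk plus rel32.
std::uint32_t JumpTarget(const ThunkBytes& code, std::uint32_t thunk_addr) {
    const std::uint32_t next_instruction =
        thunk_addr + static_cast<std::uint32_t>(code.size());
    return next_instruction + get_le32(code, 9);
}

// thunk_addr: address where the thunk will live; proc: the real callback.
ThunkBytes EncodeStdcallThunk(std::uint32_t thunk_addr,
                              std::uint32_t this_ptr,
                              std::uint32_t proc) {
    ThunkBytes code{};
    put_le32(code, 0, 0x042444C7u);  // mov dword ptr [esp+4], imm32
    put_le32(code, 4, this_ptr);     // imm32 = the this pointer
    code[8] = 0xE9;                  // jmp rel32
    const std::uint32_t next_instruction =
        thunk_addr + static_cast<std::uint32_t>(code.size());
    put_le32(code, 9, proc - next_instruction);  // wraps for backward jumps
    assert(JumpTarget(code, thunk_addr) == proc);
    return code;
}
```

The relative offset is measured from the end of the thunk because a jmp rel32 is relative to the address of the next instruction, which is what the expression (INT_PTR)this + sizeof(_stdcallthunk) computes in the code above.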

The code relating to the final step is as follows:

C++

LRESULT CALLBACK TurnCallbackIntoMember(HWND hWnd, UINT message, 
                 WPARAM wParam, LPARAM lParam)
{
    // Now fetch the this pointer from the call stack
    CWindowWithThunk* pThis = (CWindowWithThunk*)hWnd;
    // and call member functions using the this pointer
    pThis->OnPaint();
    return 0;  // a function declared to return LRESULT must return a value
}

Providing an interface proxy

In C++, an interface is a collection of method declarations without implementations. An interface pointer is a pointer to a vptr, which in turn points to a vtable: an array of pointers to the functions that implement the methods declared in the interface. An interface proxy (hereinafter referred to as a proxy) looks the same as an interface as far as the client is concerned. When a client calls methods through a proxy pointer, it ends up calling the implementation of the proxy. That implementation can do anything it wants, such as fetching the arguments that the client pushed onto the call stack and forwarding them to the real implementation of the interface.

For instance, in COM marshaling, or any other RPC environment, when a client requests an interface pointer from COM, COM just returns a proxy pointer. The client then calls methods through the proxy pointer, which ends up calling the implementation of the proxy. The proxy implementation fetches the arguments from the call stack, packs them, and sends them to a remote machine or another apartment, where the real implementation of the interface is called.

Now, the question is how we should write proxies for interfaces. One answer is to write a separate proxy for each interface, but this is tedious. A better solution is to have a single proxy for all interfaces. COM has such a Universal Marshaler (or Type Library Marshaler), which provides a single proxy for all interfaces with the help of type libraries.

A single proxy for all interfaces means that one proxy implementation (one method definition) handles every method call the client makes on any interface. This single proxy implementation must therefore know which method of which interface is being called, because only then can it determine what arguments to expect on the call stack.

When a client requests an interface pointer using an IID (interface ID), it is requesting a pointer to a vptr, which in turn points to a vtable. So, we can create a vtable and associate it with the IID (provided by the client) and method indexes. A method index is the index of the method within an interface, and it is also the index of the method within the vtable. The valid indexes run from 0 to one less than the total number of methods in the interface, and we can get that total by querying a type library using the IID.

But, a vtable is just an array of DWORDs on the x86 platform, each holding one function address, so there is no room in it for the IID and the method indexes. Here, we can use thunks again. We prepare a thunk for each method, initialize each thunk with the IID and a method index, then fill the vtable with the address of each thunk. The main job of the thunk is to push both the IID and the method index onto the call stack, and then forward the call to the single proxy implementation. The single proxy implementation can now determine which method of which interface is being called using the IID and the method index. The procedure can be summarized as follows:

1. The client requests an interface pointer using an IID.
2. We initialize each thunk with the IID and a method index, and fill the vtable with the address of each thunk. See figure 2.
3. We return the proxy pointer (a pointer to the vptr which points to the vtable created in step 2) to the client.
4. The client calls a method using the proxy pointer (the client ends up calling a thunk initialized in step 2).
5. The thunk pushes both the IID and the method index onto the stack and calls the single proxy implementation. See figure 3.
6. The single proxy implementation determines what arguments to expect by querying a type library using the IID and the method index. Now, the single proxy implementation can do whatever it wants.

Figure 2: The relationship between a vtable and its associated thunks. Here the interface ID is 1234.

Figure 3: The left side shows the original call stack, the right side shows the call stack after the IID and the method index are pushed.
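
The idea of one implementation behind many vtable slots can be modeled in portable C++ as follows. This is a sketch under stated assumptions, not the sample's code: closures stand in for the machine-code thunks, a vector of int stands in for the raw stack arguments, and CallRecord, CallLog, Slot, MakeProxyTable, and the method counts returned by GetNumOfMethods are all invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// One record per intercepted call, so the example can be inspected.
struct CallRecord {
    std::uint32_t iid;
    std::uint32_t midx;
    std::vector<int> args;
};

std::vector<CallRecord>& CallLog() {
    static std::vector<CallRecord> log;
    return log;
}

// The single proxy implementation: every method of every interface ends up
// here, and the (iid, midx) pair tells it which method was called.
void ProxyImplementation(std::uint32_t iid, std::uint32_t midx,
                         const std::vector<int>& args) {
    CallLog().push_back({iid, midx, args});
}

// A slot plays the role of one vtable entry that points at a thunk.
using Slot = std::function<void(const std::vector<int>&)>;

// Hypothetical stand-in for the type library query; the counts are made up.
std::uint32_t GetNumOfMethods(std::uint32_t iid) {
    return iid == 0 ? 2u : 1u;
}

// Steps 2 and 3: build one slot per method, each carrying its own
// (iid, midx) pair, and hand the table back to the client.
std::vector<Slot> MakeProxyTable(std::uint32_t iid) {
    const std::uint32_t methods = GetNumOfMethods(iid);
    assert(methods > 0);
    std::vector<Slot> vtable;
    for (std::uint32_t midx = 0; midx < methods; ++midx) {
        // Step 5: the slot adds iid and midx, then forwards the call.
        vtable.push_back([iid, midx](const std::vector<int>& args) {
            ProxyImplementation(iid, midx, args);
        });
    }
    return vtable;
}
```

Calling slot 1 of the table built for IID 0 reaches ProxyImplementation with iid 0 and midx 1, even though the caller supplied neither value.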

Using the code: The UniversalProxy project

The code relating to step 1 is as follows:

C++

int _tmain(int argc, _TCHAR* argv[])
{
    IInterface_Zero* pI0;
    // The client requests an interface pointer using an IID of 0
    ProxyProvider(0, (void**)&pI0);
    // …
}

The code relating to steps 2 and 3 is as follows:

C++

void ProxyProvider(DWORD iid, void** ppv)
{
    // Query the type library for the total number of methods within the interface
    // using iid
    DWORD methods = FakeTypeLibrary::GetNumOfMethods(iid);
    DWORD** vptr = new DWORD*;
    DWORD* vtable = new DWORD[methods];
    for(DWORD midx = 0; midx < methods; ++midx)
    {
        thunk* pThunk = new thunk();
        // The this pointer occupies 4 bytes
        WORD bytes_to_pop = FakeTypeLibrary::GetAugumentStackSize(iid, midx) + 4; 

        // Initialize the thunk with IID and method index
        pThunk->Init(iid, midx, bytes_to_pop);

        // Fill the vtable with the address of each thunk
        vtable[midx] = (DWORD)pThunk;
    }
    (*vptr) = vtable;
    *ppv = vptr;
}

The code relating to step 4 is as follows:

C++

int _tmain(int argc, _TCHAR* argv[])
{
    // …
    pI0->DoSomething(3,'a');
    // …
}

The code relating to step 5 is as follows:

C++

void Init(DWORD iid, DWORD midx, WORD bytes)
{
    // push iid (interface id)
    // push midx (method index)
    // call ProxyImplementation
    // add esp, 8 (pop iid and midx)
    // retn bytes_to_pop (return and pop the normal arguments of the method)
    push_interface = 0x68;
    interface_id = iid;
    push_method = 0x68;
    method_idx = midx;
    call = 0xe8;
    func_offset = (DWORD)&ProxyImplementation - (DWORD)&add_esp;
    // …
}
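
The five instructions listed in the comments can be laid out as bytes without executing them. The sketch below is an illustration only: it uses the standard 32-bit x86 encodings (68 for push imm32, E8 for call rel32, 83 C4 08 for add esp, 8, and C2 for ret imm16) and assumes 32-bit addresses passed in as plain integers. EncodeProxyThunk and its helper are hypothetical names, and the part of Init that the article elides may differ in detail.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// 21 bytes: 68 <iid:4>  68 <midx:4>  E8 <rel32:4>  83 C4 08  C2 <imm16:2>
using ProxyThunkBytes = std::array<std::uint8_t, 21>;

// Store a 32-bit value in little-endian order, as x86 expects.
void put_le32(ProxyThunkBytes& code, std::size_t pos, std::uint32_t value) {
    for (std::size_t i = 0; i < 4; ++i)
        code[pos + i] = static_cast<std::uint8_t>(value >> (8 * i));
}

ProxyThunkBytes EncodeProxyThunk(std::uint32_t thunk_addr,
                                 std::uint32_t iid,
                                 std::uint32_t midx,
                                 std::uint32_t proxy_impl,
                                 std::uint16_t bytes_to_pop) {
    // x86 stack arguments occupy whole 4-byte slots.
    assert(bytes_to_pop % 4 == 0);

    ProxyThunkBytes code{};
    code[0] = 0x68;                    // push imm32 (iid)
    put_le32(code, 1, iid);
    code[5] = 0x68;                    // push imm32 (midx)
    put_le32(code, 6, midx);
    code[10] = 0xE8;                   // call rel32
    // rel32 is relative to the instruction after the call, at offset 15.
    put_le32(code, 11, proxy_impl - (thunk_addr + 15u));
    code[15] = 0x83;                   // add esp, 8 (drop iid and midx)
    code[16] = 0xC4;
    code[17] = 0x08;
    code[18] = 0xC2;                   // ret imm16 (drop this and arguments)
    code[19] = static_cast<std::uint8_t>(bytes_to_pop & 0xFFu);
    code[20] = static_cast<std::uint8_t>(bytes_to_pop >> 8);
    return code;
}
```

For DoSomething(int, char) the caller pushes two 4-byte argument slots plus the this pointer, so bytes_to_pop would be 12, matching the "+ 4" added in ProxyProvider.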

The code relating to the final step is as follows:

C++

static void _cdecl ProxyImplementation(DWORD midx, DWORD iid, DWORD client_site_addr, 
                   void* pThis /*, arg0, ..., argn_1*/)
{
    // …
    // Use iid and midx to determine what arguments should be on the stack
    // In fact we should determine the arguments by querying the type library,
    // I hard-code here for the purpose of simplification.
    if(iid == 0)
    {
        if(midx == 0)
        {
            // Fetch the arguments from the stack
            BYTE* arg_addr = (BYTE*)&pThis + 4;
            int arg0 = *(int*)arg_addr;
            arg_addr += 4;
            char arg1 = *(char*)arg_addr;

            // Call the real implementation of the interface
            pRealImpl_Zero->DoSomething(arg0,arg1);
        }
    }
    //…
}
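
The pointer arithmetic on &pThis is easier to follow against an explicit picture of the stack. The sketch below models the stack as an array of 4-byte slots, lowest address first, in the order implied by the parameter list of ProxyImplementation. The slot names, DecodedCall, DecodeDoSomething, and the sample addresses are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Slot indexes as seen on entry to the proxy implementation.
enum : std::size_t {
    kReturnToThunk  = 0,  // pushed by the thunk's call instruction
    kMidx           = 1,  // pushed by the thunk
    kIid            = 2,  // pushed by the thunk
    kReturnToClient = 3,  // pushed by the client's call (client_site_addr)
    kThis           = 4,  // pushed by the client
    kFirstArg       = 5   // arg0; later arguments follow in higher slots
};

struct DecodedCall {
    std::uint32_t iid;
    std::uint32_t midx;
    int arg0;
    char arg1;
};

// Decode a call to DoSomething(int, char) from the modeled stack.
DecodedCall DecodeDoSomething(const std::vector<std::uint32_t>& stack) {
    assert(stack.size() >= kFirstArg + 2);
    DecodedCall call;
    call.iid = stack[kIid];
    call.midx = stack[kMidx];
    // "(BYTE*)&pThis + 4" in the article is the slot right after kThis.
    call.arg0 = static_cast<int>(stack[kFirstArg]);
    // A char argument still occupies a full slot; only the low byte counts.
    call.arg1 = static_cast<char>(stack[kFirstArg + 1] & 0xFFu);
    return call;
}
```

A stack holding 3 and 'a' in the two argument slots decodes to arg0 == 3 and arg1 == 'a', which is what the client passed in pI0->DoSomething(3,'a').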

Supporting virtual functions under multiple inheritance in C++

Consider the following code:

C++

#include <iostream>
using std::cout;

class Base1
{
public:
    virtual ~Base1(){}
private:
    int Base1Data;
};

class Base2
{
public:
    virtual ~Base2()
    {
        cout << this->Base2Data;
    }
private:
    int Base2Data;
};

class Derived : public Base1, public Base2
{
public:
    virtual ~Derived()
    {
        cout << this->DerivedData;
    }
private:
    int DerivedData;
};

void DeleteObj(Base2* pObj)
{
    delete pObj;
}

int main()
{
    Base2* pB2 = new Derived();
    DeleteObj(pB2);
    return 0;
}

The Base2 pointer pB2 is assigned the address of a Derived object. But, the address of the new Derived object must be adjusted to address its Base2 subobject before it can be saved to pB2. The code to do this is generated by the compiler:

C++

// Pseudo C++ code: the offset is in bytes, hence the cast to char*
Derived* temp = new Derived;
Base2* pB2 = temp ? (Base2*)((char*)temp + sizeof(Base1)) : 0;
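
The adjustment can be observed directly by comparing addresses. This is a small sketch with stand-in classes that mirror the ones above. Base2Offset and ToBase2 are hypothetical helpers, and the exact offset depends on the compiler's object layout, so the code only relies on it being greater than zero.

```cpp
#include <cassert>
#include <cstddef>

struct Base1 { virtual ~Base1() {} int base1_data = 0; };
struct Base2 { virtual ~Base2() {} int base2_data = 0; };
struct Derived : Base1, Base2 { int derived_data = 0; };

// The implicit conversion is where the compiler inserts the adjustment.
// A null Derived* stays null, which is why the generated code tests temp.
Base2* ToBase2(Derived* d) {
    return d;
}

// Byte distance from the start of the complete object to its Base2 part.
std::ptrdiff_t Base2Offset(Derived* d) {
    assert(d != nullptr);
    Base2* b2 = ToBase2(d);
    return reinterpret_cast<char*>(b2) - reinterpret_cast<char*>(d);
}
```

Converting back with static_cast<Derived*> subtracts the same offset, so the round trip yields the original address.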

Now, let's take a look at the statement "delete pObj" inside the DeleteObj() function. At this point, the compiler has no idea what object pObj points to. If pObj points to a Base2 object, pObj (as the this pointer) should be pushed onto the call stack and Base2::~Base2() should be called. If pObj points to a Derived object, pObj should be readjusted to address the beginning of the complete Derived object before it is pushed onto the call stack and Derived::~Derived() is called. But, because the compiler does not know what object pObj points to, it cannot determine whether to readjust pObj or not. So, this decision and readjustment can only be made at runtime.

Here, thunks can help again. We can create a thunk for each virtual function that requires adjustment/readjustment of the this pointer, and then fill the vtable slot with the address of the thunk. The main job of the thunk is to adjust the this pointer and then jump to the actual virtual function. The thunk looks like:

C++

// Pseudo C++ code
Base2_destructor_thunk:
    this = (Derived*)((char*)this - sizeof(Base1));  // offset is in bytes
    Derived::~Derived(this);

Now, let's look at the DeleteObj() function again. When pObj points to a Base2 object, the vtable slot for the destructor contains the address of Base2::~Base2(), so "delete pObj" simply calls Base2::~Base2(). When pObj points to a Derived object, the vtable slot for the destructor contains the address of the thunk (Base2_destructor_thunk, in this case), so "delete pObj" calls the thunk, which adjusts the this pointer and then jumps to Derived::~Derived().
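
The effect of the readjustment can be checked from ordinary C++, without looking at the generated thunk. The sketch below records the this pointer that Derived::~Derived() actually receives when the object is deleted through a Base2 pointer. The classes are stand-ins for the ones above, and Addresses, DeleteThroughBase2, and the global variable are hypothetical names. Whether the compiler uses a thunk or another mechanism is implementation-specific, but the observable result is the same.

```cpp
#include <cassert>
#include <cstdint>

// Set by Derived::~Derived() while the object is still alive.
std::uintptr_t g_this_in_derived_dtor = 0;

struct Base1 { virtual ~Base1() {} int base1_data = 0; };
struct Base2 { virtual ~Base2() {} int base2_data = 0; };
struct Derived : Base1, Base2 {
    ~Derived() override {
        g_this_in_derived_dtor =
            reinterpret_cast<std::uintptr_t>(static_cast<void*>(this));
    }
    int derived_data = 0;
};

struct Addresses {
    std::uintptr_t complete_object;     // address returned by new Derived
    std::uintptr_t base2_subobject;     // address stored in the Base2*
    std::uintptr_t seen_by_destructor;  // this inside Derived::~Derived()
};

Addresses DeleteThroughBase2() {
    Derived* d = new Derived();
    Base2* pB2 = d;  // adjusted forward to the Base2 subobject

    Addresses a;
    a.complete_object = reinterpret_cast<std::uintptr_t>(static_cast<void*>(d));
    a.base2_subobject = reinterpret_cast<std::uintptr_t>(static_cast<void*>(pB2));

    g_this_in_derived_dtor = 0;
    delete pB2;  // virtual call through the Base2 part of the vtable
    assert(g_this_in_derived_dtor != 0);  // the Derived destructor ran
    a.seen_by_destructor = g_this_in_derived_dtor;
    return a;
}
```

The destructor sees the address of the complete object, not the address stored in pB2, which is exactly the readjustment the thunk is responsible for.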

In conclusion

There are other uses of thunks, such as API hooking, message filtering, and so on. But, the idea behind them is the same: intercepting the call and modifying the call stack. The WindowWithThunk sample inserts the this pointer into the call stack; the UniversalProxy sample pushes two extra arguments onto the call stack; the MultipleInheritance sample modifies the this pointer already on the call stack.

Acknowledgements and references

ATL Internals: Working with ATL 8, Second Edition by Christopher Tavares, Kirk Fertitta, Brent Rector, Chris Sells. Published by Addison Wesley Professional.
Inside the C++ Object Model by Stanley B. Lippman. Published by Addison Wesley.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)