Another new thunk copy from ATL

29 Jul 2012
A technique for dropping ATL while building a similarly flexible architecture yourself.

Introduction     

This article shows a way to do without ATL while keeping a core similar to ATL's. Once you grasp this thunk technique, you can start quickly from a lightweight framework of your own instead of writing procedural C++ code directly against the Windows SDK. The result is clearer, better-structured code that makes real use of classes.

Background  

Most Windows developers know how to work with the ATL framework, but fewer know its core mechanism: ATL uses a thunk technique to pass the This pointer of a class instance into the WindowProc callback. Assembly code in the thunk substitutes the This pointer for the first parameter, hWnd, so the callback can fetch it from there. Unlike MFC, which looks the This pointer up in a mapping table, this needs no lookup, so it performs better than MFC.

I'm no guru in C++, let alone ATL. In my spare time I like to develop small Windows apps, and I want them to perform well, so I chose ATL as the base. But that base is still too big; in other words, it still carries complex encapsulation built on C++ templates, which is not a good fit for a small Windows app. The other choice is coding directly against the Windows SDK, but then you cannot effectively use classes to wrap methods into components.

Caught in this dilemma, I investigated and dug into the core of the ATL framework. Now I know how ATL grabs the This pointer of a class instance and passes it to WindowProc: it is the thunk technique mentioned above.

The principle is that ATL pushes pThis, together with the ID of the thread creating the window, onto a global list maintained by _ATLModule, then pops pThis from that list in WindowProc. From my understanding, and from some opinions I found on Google, this should be reliable because a thread cannot create several windows at the same time and the calls to WindowProc arrive in FIFO order, i.e. the first created window takes the first This pointer in the global list.
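The list-based hand-off described above can be modelled in portable C++. This is only a minimal sketch of the idea, not ATL's actual code: the names CreateEntry and CreateRegistry are hypothetical, and std::thread::id stands in for the Windows thread ID.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical model of a per-thread creation list; not ATL's actual code.
struct CreateEntry {
    std::thread::id thread;  // thread that is creating the window
    void*           object;  // the pThis to hand to the first WindowProc call
};

class CreateRegistry {
public:
    // Called just before the window is created.
    void add(void* object) {
        assert(object != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        entries_.push_back({std::this_thread::get_id(), object});
    }

    // Called from the first WindowProc invocation. Returns the oldest
    // pending object of the calling thread, or nullptr if there is none.
    void* extract() {
        std::lock_guard<std::mutex> lock(mutex_);
        const auto me = std::this_thread::get_id();
        auto it = std::find_if(entries_.begin(), entries_.end(),
                               [me](const CreateEntry& e) { return e.thread == me; });
        if (it == entries_.end()) return nullptr;
        void* object = it->object;
        entries_.erase(it);
        return object;
    }

private:
    std::mutex mutex_;
    std::vector<CreateEntry> entries_;
};
```

The lock and the search by thread are the kind of per-creation bookkeeping that the rest of this article tries to avoid.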

In my case, I plan to get rid of ATL but still use its thunk technique so that my code can be wrapped in classes. The question is: if I don't use ATL, how do I get the thunk mechanism? As noted, the thunk bookkeeping is maintained by _ATLModule, so at first it looks as if we cannot get rid of it; we have to find another way.

Using the code 

After fighting with it for a night, I found the key point: the thunk data structure whose purpose is to pass the This pointer of a class into WindowProc. I made some changes to the original (you can find it in <atlstdthunk.h>). The modified structure is shown below; keep it in mind, and think of it as a new thunk technique!

#pragma pack(push,1)
struct _stdcallthunk
{
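    DWORD   m_mov;          // opcode bytes of "mov dword ptr [esp+N], imm32"; set by the owner, not by Init
    DWORD   m_this;         // imm32 operand of the mov: the This pointer
    BYTE    m_jmp;          // jmp WndProc
    DWORD   m_relproc;      // relative jmp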
    BOOL Init(DWORD_PTR proc, void* pThis)
    {
        m_this = PtrToUlong(pThis);
        m_jmp = 0xe9;
        m_relproc = DWORD((INT_PTR)proc - ((INT_PTR)this+sizeof(_stdcallthunk)));
        // write block from data cache and
        //  flush from instruction cache
        FlushInstructionCache(GetCurrentProcess(), this, sizeof(_stdcallthunk));
        return TRUE;
    }
    //some thunks will dynamically allocate the memory for the code
    void* GetCodeAddress()
    {
        return this;
    }
};
#pragma pack(pop)  

If you compare the changed code with the original in <atlstdthunk.h>, you will find that m_mov is no longer set in the Init function. I removed that assignment deliberately and will explain why later on.

For convenience, I built a dialog demo project that uses the new thunk technique. The CTestDlg class holds two instances of the new thunk data structure, as shown in the code below.

Note: the sections below involve a little assembly language. It is not required; you can try to read it anyway, or skip it and go straight to the dialog demo project.

Unless stated otherwise, we are only concerned with X86 thunks, i.e. the thunk code under the directive #if defined(_M_IX86) (see Remark 1).

HANDLE CTestDlg::s_hPrivHeap = NULL;

CTestDlg::CTestDlg(void)
{
	// Allocate the thunk(s) from an executable private heap to avoid DEP (Data Execution Prevention) faults
	if (!s_hPrivHeap)
	{
		// A dwMaximumSize of zero makes the private heap growable;
		// its size is limited only by available memory.
		s_hPrivHeap = ::HeapCreate(HEAP_CREATE_ENABLE_EXECUTE, 0, 0);
		if (!s_hPrivHeap) throw "error: failed to create private heap!";
	}

	m_thunk = (_stdcallthunk*)::HeapAlloc(s_hPrivHeap, HEAP_ZERO_MEMORY, sizeof(_stdcallthunk));//(_stdcallthunk*)VirtualAlloc(NULL, sizeof(_stdcallthunk), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
	if (!m_thunk) throw "error: m_thunk cannot be allocated by HeapAlloc";
#if defined(_M_IX86) 
	// mov dword ptr [esp+0x14], pThis (esp + 0x14 is the custom 5th param- pThis)
	m_thunk->m_mov = 0x142444C7;
#elif defined (_M_AMD64)
	// mov r9, pThis (r9 is the 64-bit register that holds the 4th parameter, lParam)
	m_thunk->m_mov = 0xb949; // mov r9, pThis
#endif
	
	m_thunk2 = (_stdcallthunk*)::HeapAlloc(s_hPrivHeap, HEAP_ZERO_MEMORY, sizeof(_stdcallthunk));//(_stdcallthunk*)VirtualAlloc(NULL, sizeof(_stdcallthunk), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
	if (!m_thunk2) throw "error: m_thunk2 cannot be allocated by HeapAlloc";
#if defined(_M_IX86)
	// mov dword ptr [esp+0x4], pThis (esp+0x04 is hWnd, pThis is assigned to hWnd)
	m_thunk2->m_mov = 0x042444C7; 
#elif defined (_M_AMD64)
	// mov rcx, pThis (rcx is the 64-bit register that holds the 1st parameter, hWnd)
	m_thunk2->m_mov = 0xb948; 
#endif
} 

In the constructor of the CTestDlg class, I allocate the two _stdcallthunk instances from a private heap created by HeapCreate with the HEAP_CREATE_ENABLE_EXECUTE flag (see Remark 2). This avoids the DEP (Data Execution Prevention) issue: if the thunks were allocated in the normal way, their memory pages would not be marked executable, and once DEP is enabled in the system's advanced settings, executing a thunk would crash the process.

 

Take your time to understand the DEP issue. Now let's focus on the key code, where I set a different assembly instruction in m_mov for each of the two thunk instances.

The first instance, m_thunk, replaces the global list that ATL uses to grab the This pointer. Instead of looking the pointer up, it appends it as an extra parameter on the stack frame with a mov instruction:

m_thunk->m_mov = 0x142444C7;

This corresponds to:

mov dword ptr [esp+0x14], pThis    
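To make the link between the constant and the instruction concrete, here is a small portable sketch that builds the 13 bytes of an x86 thunk by hand. The helper names (mov_to_stack_slot, stack_offset_of_arg, encode_thunk) are hypothetical, addresses are passed in as plain 32-bit numbers, and nothing is executed; it only illustrates the encoding, assuming the packed layout of _stdcallthunk shown earlier.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Packed x86 thunk: 4 (mov opcode) + 4 (imm32) + 1 (jmp) + 4 (rel32) bytes.
constexpr std::size_t kThunkSize = 13;

// Opcode bytes C7 44 24 xx read as a little-endian DWORD:
// "mov dword ptr [esp+xx], imm32".
constexpr std::uint32_t mov_to_stack_slot(std::uint8_t offset) {
    return 0x002444C7u | (static_cast<std::uint32_t>(offset) << 24);
}

// On entry [esp] holds the return address, so the 1-based
// parameter slot n sits at [esp + 4*n].
constexpr std::uint8_t stack_offset_of_arg(unsigned n) {
    return static_cast<std::uint8_t>(4u * n);
}

static void put_le32(std::uint8_t* out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) out[i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Hypothetical helper: the bytes a thunk located at thunkAddr would hold.
std::array<std::uint8_t, kThunkSize> encode_thunk(std::uint32_t thunkAddr,
                                                  std::uint32_t procAddr,
                                                  std::uint32_t pThis,
                                                  std::uint8_t offset) {
    assert(offset % 4 == 0);  // stack slots are DWORD-sized on x86
    std::array<std::uint8_t, kThunkSize> bytes{};
    put_le32(&bytes[0], mov_to_stack_slot(offset));  // m_mov
    put_le32(&bytes[4], pThis);                      // m_this
    bytes[8] = 0xE9;                                 // m_jmp: jmp rel32
    // m_relproc: relative to the first byte after the thunk
    put_le32(&bytes[9], procAddr - (thunkAddr + static_cast<std::uint32_t>(kThunkSize)));
    return bytes;
}
```

With these helpers, mov_to_stack_slot(stack_offset_of_arg(5)) gives the constant used for m_thunk, and mov_to_stack_slot(stack_offset_of_arg(1)) gives the one used for m_thunk2.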

Now we can retrieve the This pointer in StartDialogProc:

INT_PTR CALLBACK CTestDlg::StartDialogProc(HWND hwndDlg,  // handle to dialog box
                            UINT uMsg,     // message
                            WPARAM wParam, // first message parameter
                            LPARAM lParam,  // second message parameter
                            DWORD_PTR This
                            )
{
    CTestDlg* pThis = (CTestDlg*)This;

    pThis->m_hDlg = hwndDlg;

    // Initialize the next thunk, which forwards to the real dialog procedure
    pThis->m_thunk2->Init((DWORD_PTR)pThis->DialogProc, pThis);
    DLGPROC pProc = (DLGPROC)pThis->m_thunk2->GetCodeAddress();
    DLGPROC pOldProc = (DLGPROC)::SetWindowLongPtr(hwndDlg, DWLP_DLGPROC, (LONG_PTR)pProc);

    return pProc(hwndDlg, uMsg, wParam, lParam);
}  

Once we have the This pointer, we store the current HWND in pThis->m_hDlg and then initialize the next thunk, m_thunk2. It forwards to the real window procedure, DialogProc, with the first parameter hwndDlg replaced by the This pointer.

INT_PTR CALLBACK CTestDlg::DialogProc(HWND hwndDlg,  // handle to dialog box
                            UINT uMsg,     // message
                            WPARAM wParam, // first message parameter
                            LPARAM lParam  // second message parameter
                            )
{
    CTestDlg* pThis = (CTestDlg*)hwndDlg; // We take out pThis from hwndDlg
...   

All of the above is similar to the original ATL thunk mechanism. The difference is that we use two different thunks (i.e. two different mov instructions) and do not depend on a global list to get the This pointer back, so we get rid of ATL completely!
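The two-stage flow can be followed without any machine code in a small portable model. This is only a sketch under loud assumptions: Handle, Dialog, Window and dispatch are hypothetical stand-ins, and dispatch plays the role of the operating system plus both thunks, so it shows the control flow but not the real mechanism.

```cpp
#include <cassert>
#include <cstdint>

using Handle = std::uintptr_t;  // stand-in for HWND

struct Dialog {
    Handle hDlg = 0;     // real window handle, saved on the first message
    int    handled = 0;  // number of messages that reached RealProc

    // Stage 2: the "handle" slot actually carries the object pointer.
    static int RealProc(Handle slot, unsigned msg) {
        Dialog* self = reinterpret_cast<Dialog*>(slot);
        ++self->handled;
        return static_cast<int>(msg);
    }

    // Stage 1: receives the object as an extra trailing parameter.
    static int StartProc(Handle hwnd, unsigned msg, Dialog* self) {
        assert(self != nullptr);
        self->hDlg = hwnd;  // remember the real handle before it is replaced
        return RealProc(reinterpret_cast<Handle>(self), msg);
    }
};

// Stand-in for a window whose procedure is switched after the first message.
struct Window {
    Handle  handle;
    Dialog* bound;
    bool    started;
};

int dispatch(Window& w, unsigned msg) {
    if (!w.started) {
        w.started = true;  // models the SetWindowLongPtr(DWLP_DLGPROC) switch
        return Dialog::StartProc(w.handle, msg, w.bound);         // first thunk
    }
    return Dialog::RealProc(reinterpret_cast<Handle>(w.bound), msg);  // second thunk
}
```

Because the handle slot is overwritten in the second stage, the real handle has to be saved in the first one, which is why StartDialogProc stores it in m_hDlg.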

Points of Interest   

This new thunk technique is not really original; I just applied a new approach to retrieving the This pointer. With it you not only get rid of ATL, you also keep your code tidy, as I mentioned at the beginning. It also benefits performance when you are writing only a small Windows app.

By the way, the new thunk technique was first written for X86 only. The X64 investigation I planned has since been added to the demo project (see Remark 1 and the History section).

Remarks 

1. The demo project also supports X64 thunks. The assembly code for X64 thunks differs a little from the X86 version: it passes the This pointer in the 4th parameter, lParam, of StartDialogProc. On X64 the stack frame is managed by the caller rather than the callee, so placing the This pointer on the stack frame at [rsp+xxh] would require extra assembly in the thunk (which acts as the caller) to restore the stack frame. That is too complex for me, so I pass the This pointer through lParam only. As far as I can tell, lParam carries nothing when StartDialogProc is first called, which is why I chose it, but I'm not sure my understanding is correct.

2. Earlier, I used VirtualAlloc to build thunk instances dynamically, but that wasted memory because each thunk got its own memory page (4 KB). Sharing one page among several thunks would require extra code to keep a list and track which thunk slots are free. HeapCreate / HeapAlloc / HeapFree manage this automatically behind a heap.
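As an illustration of Remark 1, the sketch below encodes the bytes of an x64 thunk in portable C++. The article does not show the demo's x64 structure, so the layout here is an assumption: a "mov reg, imm64" followed by "mov rax, imm64; jmp rax". The names Target and encode_thunk64 are hypothetical. The constants 0xb949 and 0xb948 from the constructor are the first two bytes of the r9 and rcx variants, read as little-endian values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Which register receives pThis: rcx is the 1st parameter (hWnd),
// r9 is the 4th parameter (lParam).
enum class Target { Rcx, R9 };

// Assumed layout: 2+8 (mov reg, imm64) + 2+8 (mov rax, imm64) + 2 (jmp rax).
constexpr std::size_t kThunk64Size = 22;

static void put_le64(std::uint8_t* out, std::uint64_t v) {
    for (int i = 0; i < 8; ++i) out[i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Hypothetical helper: builds the thunk bytes without executing anything.
std::array<std::uint8_t, kThunk64Size> encode_thunk64(Target reg,
                                                      std::uint64_t pThis,
                                                      std::uint64_t proc) {
    assert(proc != 0);
    std::array<std::uint8_t, kThunk64Size> b{};
    b[0] = (reg == Target::R9) ? 0x49 : 0x48;  // REX.W, plus REX.B for r9
    b[1] = 0xB9;                               // mov rcx/r9, imm64
    put_le64(&b[2], pThis);
    b[10] = 0x48;                              // REX.W
    b[11] = 0xB8;                              // mov rax, imm64
    put_le64(&b[12], proc);
    b[20] = 0xFF;                              // jmp rax
    b[21] = 0xE0;
    return b;
}
```

Loading the target into rax and jumping through it avoids the 32-bit range limit of a relative jmp, which matters when the heap block and the code are far apart in a 64-bit address space.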

History       

2012-7-28: Fixed the DEP (Data Execution Prevention) issue.

2012-7-29: Thunks are now managed automatically by a private heap in the process, and the new demo project supports X64 thunks.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
