Table of Contents
- Introduction
- Motivation
- Hooking the COM interface, IHTMLDocument3
- Finding problems in the code
- Solution #1: Using a naked function
- Solution #1: Limitations
- Solution #2: Implementing the TearoffThunk12 function to find out the real method
- Solution #2: Limitations
- Points of Interest
- Software based on this technique
- References
- History
Introduction
The basic concept of intercepting methods of COM components is simple. As many articles [1, 2, 3, 4] explain, you can change an entry in a component's virtual table to redirect a method to your own handler. In theory, this mechanism should work for every COM component. In practice, you may run into some weird errors when implementing and debugging the technique with certain components. For example, intercepting the IHTMLDocument3::attachEvent method in the straightforward way causes a runtime error.
This article analyzes why these errors occur and presents solutions to the problem.
Motivation
I decided to write this article to share the experience and knowledge I gained while implementing one of my applications. A month ago, I started developing an application that disables script code which blocks the copy and paste feature. For example, as an article [5] states, some web sites register script functions to prevent users from copying and pasting. Most of these can be neutralized by removing the registered event handlers. However, if an anonymous function is registered as an event handler through the attachEvent function, the handler cannot be deregistered. This is because the detachEvent function requires the identifier of the registered function, which anonymous functions do not provide.
To solve this problem, I decided to write code to hook the COM interface method IHTMLDocument3::attachEvent.
Hooking the COM Interface, IHTMLDocument3
Since hooking the COM interface is a well documented topic, this section only includes the code and its detail, not the basic concept and background.
If you would like to see the basic idea in detail, the articles cited above [1, 2, 3, 4] are good references.
To intercept a method of the IHTMLDocument3 interface, the very first thing to do is obtain the browser's document interface. Since I am working with the Web Browser ActiveX control, I chose the 'DownloadBegin' event to retrieve the interface. Creating an event handler for 'DownloadBegin' gives you the 'OnDownloadBeginExplorer1' function. Using the code below, you can get the IHTMLDocument3 interface on every web page load. The obtained interface remains available until the web page is reloaded.
void CxxxDlg::OnDownloadBeginExplorer1()
{
    HRESULT hr;
    IDispatch* lpDisp = m_web.GetDocument();
    if(lpDisp){
        IHTMLDocument3* pDoc3 = 0;
        hr = lpDisp->QueryInterface( IID_IHTMLDocument3, (void**)&pDoc3);
        if( SUCCEEDED(hr) && pDoc3 ) {
            HookInterface1( pDoc3 );
            pDoc3->Release();
        } else {
            OutputDebugString("Failed to query interface");
        }
        lpDisp->Release();   // GetDocument() hands back a reference the caller owns
    } else {
        OutputDebugString("Failed to get document");
    }
}
The next step is constructing a C-style interface structure for IHTMLDocument3. Although you can hook a method without a C-style interface by using the offset of the method, I use the C interface to keep the code readable.
The following structure is a part of the IHTMLDocument3 vtable structure. Since I only want to hook the attachEvent method, the structure does not include the methods that come after it.
To make a C-style interface for hooking, copy all methods from IDispatchVtbl, followed by the methods of the target interface, in declaration order.
typedef struct IHookHTMLDocument3Vtbl
{
BEGIN_INTERFACE
HRESULT ( STDMETHODCALLTYPE *QueryInterface )(
IDispatch * This,
REFIID riid,
void **ppvObject);
ULONG ( STDMETHODCALLTYPE *AddRef )(
IDispatch * This);
ULONG ( STDMETHODCALLTYPE *Release )(
IDispatch * This);
HRESULT ( STDMETHODCALLTYPE *GetTypeInfoCount )(
IDispatch * This,
UINT *pctinfo);
HRESULT ( STDMETHODCALLTYPE *GetTypeInfo )(
IDispatch * This,
UINT iTInfo,
LCID lcid,
ITypeInfo **ppTInfo);
HRESULT ( STDMETHODCALLTYPE *GetIDsOfNames )(
IDispatch * This,
REFIID riid,
LPOLESTR *rgszNames,
UINT cNames,
LCID lcid,
DISPID *rgDispId);
HRESULT ( STDMETHODCALLTYPE *Invoke )(
IDispatch * This,
DISPID dispIdMember,
REFIID riid,
LCID lcid,
WORD wFlags,
DISPPARAMS *pDispParams,
VARIANT *pVarResult,
EXCEPINFO *pExcepInfo,
UINT *puArgErr);
HRESULT (STDMETHODCALLTYPE * releaseCapture)(
IHTMLDocument3* This);
HRESULT (STDMETHODCALLTYPE * recalc)(
IHTMLDocument3* This,
VARIANT_BOOL fForce);
HRESULT (STDMETHODCALLTYPE * createTextNode)(
IHTMLDocument3* This,
BSTR text,
IHTMLDOMNode **newTextNode);
HRESULT (STDMETHODCALLTYPE * get_documentElement)(
IHTMLDocument3* This,
IHTMLElement **p);
HRESULT (STDMETHODCALLTYPE * get_uniqueID)(
IHTMLDocument3* This,
BSTR *p);
HRESULT (STDMETHODCALLTYPE * attachEvent)(
IHTMLDocument3* This,
BSTR event,
IDispatch *pDisp,
VARIANT_BOOL *pfResult);
END_INTERFACE
} IHookHTMLDocument3Vtbl;
interface IHookHTMLDocument3
{
CONST_VTBL struct IHookHTMLDocument3Vtbl *lpVtbl;
};
The next step is replacing the method with your function. With the structure we made, you can access the address of the method without calculating its offset. Before the entry in the virtual table can be overwritten, the VirtualProtect API is used to make that memory area writable. The HookInterface1 function does that work.
void CIETestDlg::HookInterface1(IHTMLDocument3 *pDoc3)
{
    IHookHTMLDocument3* pHookDoc3 = (IHookHTMLDocument3*)pDoc3;
    CString s;
    s.Format("Address of Vtbl : %08X, FuncAddr : %08X, mshtml.dll : %08X",
        pHookDoc3->lpVtbl, pHookDoc3->lpVtbl->attachEvent,
        GetModuleHandle("mshtml.dll") );
    OutputDebugString(s);
    if( (PFNATTACHEVENT)pHookDoc3->lpVtbl->attachEvent !=
        (PFNATTACHEVENT)New_attachEvent ) {
        DWORD dwOldProt = 0;
        if( VirtualProtect(&(pHookDoc3->lpVtbl->attachEvent),
            4, PAGE_EXECUTE_READWRITE, &dwOldProt) ) {
            g_pfnOrgAttachEvent = (PFNATTACHEVENT)pHookDoc3->
                lpVtbl->attachEvent;
            g_ppfnOrgAttachEvent = (PFNATTACHEVENT*)&
                (pHookDoc3->lpVtbl->attachEvent);
            pHookDoc3->lpVtbl->attachEvent = New_attachEvent;
            VirtualProtect(&(pHookDoc3->lpVtbl->attachEvent),
                4, dwOldProt, &dwOldProt);
        } else {
            OutputDebugString("Failed to unprotect the memory area");
        }
    } else {
        OutputDebugString("This has been already hooked");
    }
}
The following function is the new handler for the attachEvent method. It just shows a message box to confirm that the method has been hooked.
HRESULT STDMETHODCALLTYPE New_attachEvent(
    IHTMLDocument3* This,
    BSTR event,
    IDispatch *pDisp,
    VARIANT_BOOL *pfResult)
{
    CString szEvent ( event );
    AfxMessageBox("The attachEvent method is intercepted: event = " + szEvent);
    return g_pfnOrgAttachEvent( This, event, pDisp, pfResult );
}
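The hook-and-forward pattern used by HookInterface1 and New_attachEvent can be tried in isolation. The sketch below is a portable model, not MSHTML code: FakeDoc, FakeDocVtbl, HookSlot and the counters are hypothetical names introduced for illustration, and the table lives in ordinary writable memory, so the VirtualProtect step is not needed here.

```cpp
#include <cstdio>

// Hypothetical stand-in for a COM object: the first member points to a table
// of function pointers, just like lpVtbl in IHookHTMLDocument3.
struct FakeDoc;
typedef int (*PFNATTACH)(FakeDoc* self, const char* event_name);

struct FakeDocVtbl {
    PFNATTACH attachEvent;
};

struct FakeDoc {
    FakeDocVtbl* lpVtbl;
    int attached;   // how many handlers the "real" method has registered
};

// The "real" implementation that the hook must eventually forward to.
static int Real_attachEvent(FakeDoc* self, const char*) {
    ++self->attached;
    return 0;
}

static PFNATTACH g_pfnOrg = nullptr;  // saved original, like g_pfnOrgAttachEvent
static int g_intercepted = 0;         // number of calls that went through the hook

// Replacement with the same signature: observe the call, then forward it.
static int New_attachEvent(FakeDoc* self, const char* event_name) {
    ++g_intercepted;
    std::printf("intercepted: %s\n", event_name);
    return g_pfnOrg(self, event_name);
}

// Installs the hook once; returns false if the slot is already hooked.
static bool HookSlot(FakeDocVtbl* vtbl) {
    if (vtbl->attachEvent == New_attachEvent) return false;
    g_pfnOrg = vtbl->attachEvent;         // 1. remember the original entry
    vtbl->attachEvent = New_attachEvent;  // 2. overwrite the table entry
    return true;
}
```

After HookSlot(&vtbl) succeeds, a call such as doc.lpVtbl->attachEvent(&doc, "oncontextmenu") passes through New_attachEvent before reaching Real_attachEvent. The model only works because both functions share one signature, which is the assumption that fails for the real interface.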
In theory, this hook should work. However, as I stated at the very beginning of this article, it does not work properly. Although it seems fine at first glance, the application crashes when you select an object and move the mouse pointer over it. The picture below shows the error that occurs in this situation.
Finding Problems in the Code
To find out what was wrong in the code, the first step is inspecting the program with a debugger. I inserted a breakpoint (int 3h) in the New_attachEvent function, as in the code below, and ran it under OllyDbg.
HRESULT STDMETHODCALLTYPE New_attachEvent(
    IHTMLDocument3* This,
    BSTR event,
    IDispatch *pDisp,
    VARIANT_BOOL *pfResult)
{
    __asm int 3h
    return g_pfnOrgAttachEvent( This, event, pDisp, pfResult );
}
The application stops at the 'int 3h' instruction, and the following screenshot shows the state, including the stack and code.
As you can see, there is an event name, "oncontextmenu", on the stack (at the bottom-right of the screen). However, if you select an object and move the mouse pointer over it, the application stops with a different stack that does not contain an event name. The following picture shows that state.
How could this be? To see what happened in detail, I disassembled the original code of the attachEvent method. As you may have noticed, the HookInterface1 function prints the address of the original method, which makes it easy to find the code that the IHTMLDocument3 interface points to.
The following disassembled code is the function that the IHTMLDocument3::attachEvent entry points to. I copied it from the IDA disassembler.
.text:63988944 ; void __stdcall TearoffThunk12(void)
.text:63988944 ?TearoffThunk12@@YGXXZ proc near ; DATA XREF: .data:64030540 o
.text:63988944 ; .data:6403BAC0 o
.text:63988944
.text:63988944 arg_0 = dword ptr 4
.text:63988944
.text:63988944 ; FUNCTION CHUNK AT .text:63A40595 SIZE 00000008 BYTES
.text:63988944
.text:63988944 mov eax, [esp+arg_0]
.text:63988948 push eax
.text:63988949 test dword ptr [eax+1Ch], 1000h
.text:63988950 jnz loc_63A40595
.text:63988956
.text:63988956 loc_63988956: ; CODE XREF: TearoffThunk12(void)+B7C54 j
.text:63988956 add eax, 0Ch
.text:63988959 mov ecx, [eax]
.text:6398895B mov [esp+4+arg_0], ecx
.text:6398895F mov ecx, [eax+4]
.text:63988962 mov ecx, [ecx+30h]
.text:63988965 pop eax
.text:63988966 mov word ptr [eax+20h], 0Ch
.text:6398896C jmp ecx
.text:6398896C ?TearoffThunk12@@YGXXZ endp
Clearly, TearoffThunk12 does not contain any code related to attaching an event handler. In other words, TearoffThunk12 does not do the real work of attachEvent. It just checks a flag and jumps to the real handler. After extensive research, I found that the TearoffThunk12 function is related to a tear-off interface, an interface that is created dynamically at runtime when it is requested [6]. Specifically, a tear-off interface is usually used to implement an interface that is rarely needed at runtime, in order to save memory.
With the above information, we can conclude that the IHTMLDocument3 interface is a tear-off interface. Also, since we observed that TearoffThunk12 is called through other interfaces when a user moves the mouse pointer over a selected object, there may be another interface that shares the virtual table of the IHTMLDocument3 interface.
By reverse-engineering the callers of TearoffThunk12, I found that a method of the IHTMLElement interface calls TearoffThunk12. I verified this with the code below:
void CIETestDlg::OnBtnshowvtbl()
{
    HRESULT hr, hr2 ;
    IDispatch* lpDisp = m_web.GetDocument();
    if(lpDisp){
        IHTMLDocument3* pDoc3 = 0;
        IHTMLDocument2* pDoc2 = 0;
        hr2 = lpDisp->QueryInterface( IID_IHTMLDocument2, (void**)&pDoc2);
        hr = lpDisp->QueryInterface( IID_IHTMLDocument3, (void**)&pDoc3);
        if( SUCCEEDED(hr) && SUCCEEDED(hr2) && pDoc3 && pDoc2) {
            IHTMLElement* pElem = 0;
            hr = pDoc2->get_body( (IHTMLElement**)&pElem );
            if( SUCCEEDED(hr) && pElem ) {
                CString s;
                s.Format("%08X %08X = pDoc2\n%08X %08X = pDoc3\n%08X %08X = pElem",
                    pDoc2, ((IHookHTMLDocument3*)pDoc2)->lpVtbl,
                    pDoc3, ((IHookHTMLDocument3*)pDoc3)->lpVtbl,
                    pElem, ((IHookHTMLDocument3*)pElem)->lpVtbl );
                AfxMessageBox(s);
                pElem->Release();
            } else {
                OutputDebugString("Failed to get body");
            }
        } else {
            OutputDebugString("Failed to query interface");
        }
        // Release whatever was obtained, even after a partial failure
        if( pDoc3 ) pDoc3->Release();
        if( pDoc2 ) pDoc2->Release();
        lpDisp->Release();
    } else {
        OutputDebugString("Failed to get document");
    }
}
As the picture below shows, it is very interesting that pDoc3 and pElem in the above code share the same virtual table, lpVtbl. In other words, the test result means that IHTMLElement and IHTMLDocument3 have the same virtual table.
Therefore, since pDoc3 and pElem share the same lpVtbl, changing an entry through IHTMLDocument3 (pDoc3) also affects IHTMLElement (pElem).
However, the methods that occupy the same entry can take different numbers of parameters. For example, IHTMLDocument3::attachEvent has 4 parameters, whereas IHTMLElement::put_id, which occupies the same entry as attachEvent, has 2 parameters. A stdcall handler removes its own arguments from the stack when it returns, so if a handler declared with attachEvent's signature is entered through IHTMLElement::put_id, it pops more bytes than the caller pushed. The stack is then corrupted, which causes runtime errors (memory faults).
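The collision can be made concrete with a little arithmetic. The sketch below lists the vtable slots in declaration order and computes how many argument bytes each method expects on 32-bit x86. The IHTMLDocument3 order comes from the structure shown earlier in this article; the IHTMLElement order is my reading of the MSHTML headers and should be treated as an assumption. All enum and constant names are hypothetical.

```cpp
#include <cstddef>

// Slot order from the IHookHTMLDocument3Vtbl structure in this article:
// the seven IDispatch methods first, then the IHTMLDocument3 methods.
enum Document3Slot {
    D3_QueryInterface, D3_AddRef, D3_Release,
    D3_GetTypeInfoCount, D3_GetTypeInfo, D3_GetIDsOfNames, D3_Invoke,
    D3_releaseCapture, D3_recalc, D3_createTextNode,
    D3_get_documentElement, D3_get_uniqueID, D3_attachEvent
};

// First IHTMLElement slots (assumed declaration order, see the lead-in).
enum ElementSlot {
    EL_QueryInterface, EL_AddRef, EL_Release,
    EL_GetTypeInfoCount, EL_GetTypeInfo, EL_GetIDsOfNames, EL_Invoke,
    EL_setAttribute, EL_getAttribute, EL_removeAttribute,
    EL_put_className, EL_get_className, EL_put_id
};

constexpr int kAttachEventSlot = D3_attachEvent;
constexpr int kPutIdSlot       = EL_put_id;

// On 32-bit x86, each argument of these two methods occupies 4 stack bytes,
// and each vtable entry is 4 bytes wide.
constexpr std::size_t kWordSize = 4;

constexpr std::size_t StdcallArgBytes(std::size_t args_including_this) {
    return args_including_this * kWordSize;
}

// attachEvent(This, event, pDisp, pfResult) versus put_id(This, v)
constexpr std::size_t kAttachEventBytes = StdcallArgBytes(4);
constexpr std::size_t kPutIdBytes       = StdcallArgBytes(2);

// Extra bytes a 4-parameter stdcall hook pops when the caller pushed only
// the 2 arguments of put_id.
constexpr std::size_t kStackImbalance = kAttachEventBytes - kPutIdBytes;

// Byte offset of the shared slot inside a 32-bit vtable.
constexpr std::size_t kSlotByteOffset = kAttachEventSlot * kWordSize;
```

Both methods land in slot 12, which corresponds to byte offset 0x30 in a 32-bit table, and the mismatch between the two signatures is 8 bytes per call.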
Solution #1: Using a Naked Function
In the above section, we have discovered the problem and what happened underneath the COM component. In this section, I suggest the first solution which uses a naked function as a handler.
As you learned in the classroom, a function usually has prologue code that sets up its own stack frame. There is also epilogue code, which removes the frame and restores the stack according to the calling convention (cdecl, stdcall, fastcall, etc.) and the number of parameters.
Since the number of parameters of the original function reached through TearoffThunk12 is not known in advance, we cannot use a handler that has prologue and epilogue code. The prologue and epilogue would corrupt the stack whenever the original function's signature differs from the new handler's definition.
To solve this problem, I used a naked function as the handler, as in the code below:
void __stdcall New_attachEvent_Internal(BSTR event);   // defined below

__declspec( naked ) HRESULT New_attachEvent_naked()
{
    __asm {
        mov eax, dword ptr[esp+8]       // first argument after 'this'
        push eax
        call New_attachEvent_Internal   // stdcall: the callee pops the argument
    }
    __asm {
        jmp g_pfnOrgAttachEvent         // stack is unchanged; continue in the original
    }
}
void __stdcall New_attachEvent_Internal(BSTR event)
{
    WCHAR* pwstrEvent = (WCHAR*)event;
    if( pwstrEvent ) {
        MEMORY_BASIC_INFORMATION mbi;
        memset( &mbi, 0, sizeof(mbi) );
        if( VirtualQuery( pwstrEvent, &mbi, sizeof(mbi) ) > 0 ) {
            DWORD dwBaseAddress = (DWORD)mbi.BaseAddress;
            DWORD dwstrEvent = (DWORD)pwstrEvent;
            if( dwstrEvent == 0 || dwBaseAddress == 0 ||
                mbi.RegionSize == 0 ) {
                return ;
            }
            if( ((DWORD)dwBaseAddress < (DWORD)dwstrEvent) &&
                ((DWORD)dwstrEvent < (DWORD)(dwBaseAddress +
                    mbi.RegionSize - 1)) &&
                (mbi.State == MEM_COMMIT) &&
                ( ((mbi.Protect & PAGE_EXECUTE_READWRITE) ==
                    PAGE_EXECUTE_READWRITE) ||
                  ((mbi.Protect & PAGE_EXECUTE_READ) ==
                    PAGE_EXECUTE_READ) ||
                  ((mbi.Protect & PAGE_EXECUTE_WRITECOPY) ==
                    PAGE_EXECUTE_WRITECOPY) ||
                  ((mbi.Protect & PAGE_READONLY) ==
                    PAGE_READONLY) ||
                  ((mbi.Protect & PAGE_READWRITE) ==
                    PAGE_READWRITE) ||
                  ((mbi.Protect & PAGE_WRITECOPY) ==
                    PAGE_WRITECOPY) ) &&
                ( !((mbi.Protect & PAGE_GUARD) == PAGE_GUARD) )
              )
            {
                if( pwstrEvent[0] == 'o' && pwstrEvent[1] == 'n' ) {
                    CString szEvent(pwstrEvent);
                    MessageBox( NULL,
                        "The attachEvent method is intercepted: event = " + szEvent,
                        "New_attachEvent_Internal",
                        MB_ICONINFORMATION|MB_OK );
                }
            }
        }
    }
}
The naked handler just pushes the first parameter after the this pointer ([esp+8]) and calls the New_attachEvent_Internal function. Because that function is stdcall and removes its own argument, the stack is unchanged when the handler jumps to the original method. The New_attachEvent_Internal function validates the parameter by using the VirtualQuery API. It also checks the first two characters of the event name to decide whether the call is really for the attachEvent method: if the string starts with "on", the call is very likely a request for attachEvent. All these validation steps are implemented in the New_attachEvent_Internal function. The screenshot below shows that it works well.
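The prefix test is only a heuristic, and it is worth seeing where it can be fooled. The sketch below isolates the check in a portable helper. LooksLikeEventName and its max_chars parameter are hypothetical names; the readability check that VirtualQuery performs in the article's code is replaced here by a caller-supplied length.

```cpp
#include <cstddef>

// Returns true when the text starts with the two characters "on".
// max_chars is the number of wchar_t elements the caller knows to be
// readable at s; this stands in for the VirtualQuery validation.
static bool LooksLikeEventName(const wchar_t* s, std::size_t max_chars) {
    if (s == nullptr || max_chars < 2) {
        return false;
    }
    return s[0] == L'o' && s[1] == L'n';
}
```

An event name such as L"oncontextmenu" passes, and an element id such as L"myDiv" is rejected. An id such as L"online-banner" also passes, so a put_id call with that value would be mistaken for attachEvent. This is one reason the first solution cannot be considered fully reliable.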
Solution #1: Limitations
Although it works well now, there is no guarantee that it will keep working in the future. If the interface changes, or if a new interface that is very similar to IHTMLDocument3 but not exactly the same shares IHTMLDocument3's virtual table, it could cause further memory faults. In these cases, you may need to add more robust memory checking code to prevent errors.
Solution #2: Implementing the TearoffThunk12 Function to find out the Real Method
Another solution is to find the real method by reimplementing what the TearoffThunk12 function does. The idea is very simple. According to the above findings, TearoffThunk12 reaches the real method by using the address of the interface instance. Therefore, we can find the real method's address if we trace the function and do what it does.
The code below is the TearoffThunk12 function with my annotations.
61198944 8B4424 04        MOV EAX,DWORD PTR SS:[ESP+4]      // eax = this
61198948 50               PUSH EAX                          // push this
61198949 F740 1C 00100000 TEST DWORD PTR DS:[EAX+1C],1000   // if( (*(this + 0x1C) & 0x1000) != 0 )
61198950 0F85 3F7C0B00    JNZ mshtml.61250595               //     jmp mshtml.61250595
1. if( (*(this + 0x1C) & 0x1000) != 0 ) then goto mshtml.61250595 (return code)
61198956 83C0 0C          ADD EAX,0C                        // eax = this+0xC
61198959 8B08             MOV ECX,DWORD PTR DS:[EAX]        // ecx = *(this+0xC)
6119895B 894C24 08        MOV DWORD PTR SS:[ESP+8],ECX      // orgparam = *(this+0xC)
2. Replace original this pointer in the stack with '*(this + 0xC)'
6119895F 8B48 04          MOV ECX,DWORD PTR DS:[EAX+4]      // ecx = *(this + 0xC + 0x4)
61198962 8B49 30          MOV ECX,DWORD PTR DS:[ECX+30]     // ecx = *( *(this + 0x10) + 0x30 )
61198965 58               POP EAX                           // eax = this (original)
61198966 66:C740 20 0C00  MOV WORD PTR DS:[EAX+20],0C       // *(this+0x20) = 0xC
3. *(this + 0x20) = 0xC;
6119896C FFE1             JMP ECX                           // jmp to real handler (ecx = *( *(this + 0x10) + 0x30 ))
4. jmp *( *(this + 0x10) + 0x30 );
According to the above disassembled code, the TearoffThunk12 function can be summarized as the C-like pseudocode below.
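__declspec( naked ) void TearoffThunk12(IDispatch* pThis)
{
    // Pseudocode: offsets are in bytes; 'jmp' stands for a tail jump.
    if( (*(pThis + (0x1C/sizeof(IDispatch*))) & 0x1000) != 0 ) {
        jmp Return_Thunk;
    }
    // The 'this' argument on the stack is replaced; the object itself is not modified here.
    this_on_stack = *(pThis + (0xC/sizeof(IDispatch*)));
    *(pThis + (0x20/sizeof(IDispatch*))) = 0xC;
    jmp *( *(pThis + (0x10/sizeof(IDispatch*))) + (0x30/sizeof(IDispatch*)) );
}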
The point is that it uses "*(*(this+0x10)+0x30)" to call the original function. Another interesting thing is step 2 of the disassembled code, which replaces the original this pointer on the stack with *(this + 0xC). As you might know, the first parameter of every COM method is the this pointer. Changing the first parameter into "*(this + 0xC)" means that the tear-off object stores the original object at "*(this + 0xC)".
By inspecting the memory at *(this + 0x10), I found that it points to the real object's virtual table. The full picture of the component is presented in the picture below:
The start address of the object is the 'pDoc3' pointer itself. The object has several fields, and among them are the real object behind IHTMLDocument3 and its virtual table. In that virtual table, the entry at offset 0x30 (index 12, counting from zero) is the method we are looking for.
With these findings, I wrote a function for intercepting the method by changing the real object's element of virtual table.
void CIETestDlg::HookInterface3(IHTMLDocument3 *pDoc3)
{
    CString s;
    IHookHTMLDocument3* pHookDoc3 = (IHookHTMLDocument3*)pDoc3;
    s.Format("Address of Vtbl : %08X FuncAddr : %08X",
        pHookDoc3->lpVtbl,
        pHookDoc3->lpVtbl->attachEvent );
    OutputDebugString(s);
    LPVOID* lpObj = (LPVOID*)pDoc3;
    DWORD* pdwMask = (DWORD*)lpObj;
    DWORD dwMask;
    pdwMask = pdwMask + (0x1C / sizeof(DWORD));     // flags word at this + 0x1C
    dwMask = *pdwMask;
    if( dwMask & 0x1000 ) {
        OutputDebugString("Something wrong - Error");
    } else {
        LPVOID* lpAddress = NULL;
        LPVOID* lpAddressElem = NULL;
        lpAddress = lpObj + (0x10 / sizeof(LPVOID));        // this + 0x10
        lpAddress = (LPVOID*)*lpAddress;                    // real virtual table
        lpAddress = lpAddress + (0x30 / sizeof(LPVOID));    // entry 12 of that table
        lpAddressElem = lpAddress;
        lpAddress = (LPVOID*)*lpAddress;                    // real method address
        if( (PFNATTACHEVENT)lpAddress != (PFNATTACHEVENT)New_attachEvent ) {
            DWORD dwOldProt = 0;
            if( VirtualProtect(lpAddressElem, 4,
                PAGE_EXECUTE_READWRITE, &dwOldProt) ) {
                g_pfnOrgAttachEvent = (PFNATTACHEVENT)lpAddress;
                g_ppfnOrgAttachEvent =
                    (PFNATTACHEVENT*)lpAddressElem;
                // a function pointer needs an explicit cast to be stored as LPVOID
                *lpAddressElem = (LPVOID)New_attachEvent;
                VirtualProtect(lpAddressElem,
                    4, dwOldProt, &dwOldProt);
            } else {
                OutputDebugString
                    ("Failed to unprotect the memory area");
            }
        } else {
            OutputDebugString("This has been already hooked");
        }
    }
}
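The pointer walk that HookInterface3 performs can be exercised without Internet Explorer by modeling the tear-off object as an array of pointer-sized words. The sketch below is such a model under stated assumptions: it uses word indices (byte offset divided by 4) so that it also runs on a 64-bit host, and the names FindRealSlot, RealObject and kAltPathFlag are hypothetical. The meaning of the 0x1000 flag is not documented in this article, so the model simply refuses to continue when it is set, as HookInterface3 does.

```cpp
#include <cstdint>

// Word indices that correspond to the 32-bit byte offsets in TearoffThunk12.
constexpr int kRealObjIndex  = 0x0C / 4;  // *(this + 0x0C): real object
constexpr int kRealVtblIndex = 0x10 / 4;  // *(this + 0x10): real virtual table
constexpr int kFlagsIndex    = 0x1C / 4;  // *(this + 0x1C): flags word
constexpr int kMethodIndex   = 0x30 / 4;  // entry 12 of the real table
constexpr std::uintptr_t kAltPathFlag = 0x1000;  // thunk takes another path if set

// Returns the object that the thunk passes on as 'this'.
static void* RealObject(void** tearoff) {
    return tearoff[kRealObjIndex];
}

// Returns the address of the table entry that holds the real method,
// or nullptr when the flag tested by the thunk is set.
static void** FindRealSlot(void** tearoff) {
    const std::uintptr_t flags =
        reinterpret_cast<std::uintptr_t>(tearoff[kFlagsIndex]);
    if (flags & kAltPathFlag) {
        return nullptr;
    }
    void** real_vtbl = static_cast<void**>(tearoff[kRealVtblIndex]);
    return &real_vtbl[kMethodIndex];
}
```

The value returned by FindRealSlot plays the role of lpAddressElem in HookInterface3: it is the location to overwrite, and the pointer stored there is the original method to save before overwriting.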
Solution #2: Limitations
Although the second solution seems more convincing and accurate, the internal implementation of the interface may change in the future.
It is worth noting that I tested the second solution on systems with IE6, IE7, IE8 and IE9. (It works well on all of these IE versions with Windows XP and 7.) In all tests, the second solution achieved what we want without any errors. It is also difficult to imagine that the fundamentals of Internet Explorer will change. Still, no one can say that they never will.
Therefore, to avoid possible errors in a new version of Internet Explorer, you should verify that the method and interface still use the TearoffThunk12 function. This can be done by comparing the above disassembled code with the code in memory.
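One way to make that comparison is a byte signature check. The sketch below is a portable illustration: the opcode bytes are copied from the annotated listing above, the four bytes of the JNZ displacement are skipped because they depend on where the module is laid out, and LooksLikeTearoffThunk12 with its constants are hypothetical names. In a real hook, the code pointer would be the address read from the interface's virtual table, and the caller would first confirm that the memory is readable.

```cpp
#include <cstddef>
#include <cstdint>

// Opcode bytes of TearoffThunk12 as shown in the annotated listing.
static const std::uint8_t kThunk12Bytes[] = {
    0x8B, 0x44, 0x24, 0x04,                    // mov eax, [esp+4]
    0x50,                                      // push eax
    0xF7, 0x40, 0x1C, 0x00, 0x10, 0x00, 0x00,  // test dword ptr [eax+1Ch], 1000h
    0x0F, 0x85, 0x00, 0x00, 0x00, 0x00,        // jnz rel32 (displacement skipped)
    0x83, 0xC0, 0x0C,                          // add eax, 0Ch
    0x8B, 0x08,                                // mov ecx, [eax]
    0x89, 0x4C, 0x24, 0x08,                    // mov [esp+8], ecx
    0x8B, 0x48, 0x04,                          // mov ecx, [eax+4]
    0x8B, 0x49, 0x30,                          // mov ecx, [ecx+30h]
    0x58,                                      // pop eax
    0x66, 0xC7, 0x40, 0x20, 0x0C, 0x00,        // mov word ptr [eax+20h], 0Ch
    0xFF, 0xE1                                 // jmp ecx
};

// Half-open range of byte positions that hold the JNZ displacement.
constexpr std::size_t kWildcardBegin = 14;
constexpr std::size_t kWildcardEnd   = 18;

// Returns true when the bytes at 'code' match the listing, ignoring the
// position-dependent jump displacement. 'available' is the number of
// bytes the caller knows to be readable at 'code'.
static bool LooksLikeTearoffThunk12(const std::uint8_t* code,
                                    std::size_t available) {
    const std::size_t n = sizeof(kThunk12Bytes);
    if (code == nullptr || available < n) {
        return false;
    }
    for (std::size_t i = 0; i < n; ++i) {
        if (i >= kWildcardBegin && i < kWildcardEnd) {
            continue;  // skip the relative jump target
        }
        if (code[i] != kThunk12Bytes[i]) {
            return false;
        }
    }
    return true;
}
```

If the check fails, the safest reaction is to skip hooking altogether, since the offsets 0x0C, 0x10, 0x1C and 0x30 used by the second solution are only known to be valid for this exact thunk.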
Points of Interest
There are a number of articles on this topic (intercepting something). However, as far as I know, most of them do not cover complex practical examples. For example, techniques for intercepting COM methods were introduced several years ago, but I was not able to find any article or thread that covers what I presented in this article.
From this article, you can learn how a complex, practical COM mechanism (e.g. a tear-off interface) works. You will also become familiar with the basic concept of intercepting COM components. Lastly, this article shows how to use a debugger to find out what happened in your application.
Software Based on this Technique
As I stated in the above section, I started solving this problem in order to write my application. I am distributing the application which uses this technique on my website (http://rodream.net).
Specifically, the name of the application is 'C Browser (See Browser)', and it disables scripts that restrict browser functions.
References
[1] Volodymyr Shamray, Interception Calls to COM Interfaces
[2] Galen C. Hunt, Michael L. Scott, Intercepting and Instrumenting COM Applications
[3] Zhefu Zhang, COM Interface Hooking and Its Application - Part I
[4] Martin Mueller, Hooking a DirectX/COM Interface
[5] Computerhope.com, Disable mouse right-click
[6] Andrew Whitechapel, ATL Tear-Off Interfaces
History
- Dec 15, 2011
- Added test results on IE8 (It works well)
- Typos and errors in the article were corrected
- Dec 12, 2011