Introduction
A programmer who works with Internet Explorer needs to know which other extension DLLs are loaded into it. These DLLs interact with Internet Explorer and can introduce new problems as well as new features, and the programmer has to deal with the consequences. However, there is no documentation about this. In this document, I explain a solution in detail.
COM Component
COM is not a simple Win32 DLL. A COM component is exposed through interfaces, and its methods are called through an interface pointer (for late-bound clients, through the Invoke function). For these reasons, the COM DLL's methods are not in the export table, and we cannot retrieve their entry points with GetProcAddress. This is why we cannot reach the real code directly and why we cannot easily tell which DLL an object comes from.
We can find every loaded DLL's base address and size with the ToolHelp API. However, in the case of COM, the object instance does not live inside the DLL's address range (it is allocated separately, typically on the heap), so we cannot compare the instance's address with the ToolHelp API's result directly. For example, with a function ABCD in a simple DLL, we call it as ABCD(); and we can get the address of the function with the expression ABCD. In the case of COM, however, we have to call it through the interface (via the Invoke function), so there is no raw access point for the ABCD function (method).
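To make the contrast concrete, here is a minimal, portable sketch rather than real COM code. The names Abcd, IAbcdLike, AbcdObject, CallThroughInterface and CallThroughAddress are hypothetical and exist only for illustration. A plain function has an address that can be taken directly, whereas a method behind an interface is only reachable through an object pointer, and the code that runs is selected at run time.

```cpp
#include <cassert>

// A plain function, like one exported from a simple DLL.
// Its address can be taken directly with the expression `&Abcd`.
int Abcd() { return 1; }

// Hypothetical stand-in for a COM-style interface: pure virtual methods only.
struct IAbcdLike {
    virtual int Abcd() = 0;
};

// One possible implementation; a client normally never sees this type.
struct AbcdObject final : IAbcdLike {
    int Abcd() override { return 2; }
};

// The client only holds an interface pointer, so the call is dispatched
// through the object's virtual function table at run time.
int CallThroughInterface(IAbcdLike* p) {
    assert(p != nullptr);
    return p->Abcd();
}

// Direct call through a raw function address.
int CallThroughAddress() {
    int (*pfn)() = &Abcd;  // the raw entry point is available
    return pfn();
}
```

In this sketch the caller of CallThroughInterface never names AbcdObject::Abcd, which mirrors why a COM method has no exported entry point to look up.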
Internet Explorer and Connection Point
In order to receive events from Internet Explorer, we connect a COM object to it through the Connection Point interfaces. Just like a virtual function in C++, the connected COM object's function will be called by Internet Explorer. If you want to know the details, please refer to books on COM. A connectable object has Connection Points, and the Connection Point interface provides a method for enumerating its connections. Using this method, we can get each connection's IUnknown interface pointer and cookie value. The cookie value identifies the connection to Internet Explorer.
To connect to Internet Explorer, an object uses the Advise function, and to disconnect, it uses the Unadvise function. Advise returns a cookie, which is the connection identifier, and Unadvise takes that cookie. One funny tip is that the Unadvise function disconnects an object from Internet Explorer even if the cookie value is not yours. The IUnknown pointer which you get from the enumeration has to be released with the Release function, because during enumeration AddRef is called on it and the reference count is increased.
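The cookie and reference-counting rules above can be modeled in a few lines of portable C++. This is a simplified sketch under stated assumptions, not the real IConnectionPoint interface: SinkLike, ConnectionPointLike and all of their members are hypothetical names, while the real interfaces return HRESULT values and report connections through CONNECTDATA.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical reference-counted sink, standing in for an IUnknown-based object.
struct SinkLike {
    unsigned long refs = 1;
    unsigned long AddRef() { return ++refs; }
    unsigned long Release() { return --refs; }  // a real object would delete itself at 0
};

// Hypothetical model of a connection point that hands out cookies.
class ConnectionPointLike {
public:
    // Connects a sink and returns the cookie identifying the connection.
    unsigned long Advise(SinkLike* sink) {
        assert(sink != nullptr);
        sink->AddRef();  // the connection point keeps a reference
        unsigned long cookie = next_cookie_++;
        connections_[cookie] = sink;
        return cookie;
    }

    // Disconnects whichever sink owns the cookie; the caller's identity is not checked.
    bool Unadvise(unsigned long cookie) {
        auto it = connections_.find(cookie);
        if (it == connections_.end()) return false;
        it->second->Release();
        connections_.erase(it);
        return true;
    }

    // Returns every connected sink; each one is AddRef'ed for the caller,
    // who must call Release on it when done.
    std::vector<SinkLike*> EnumConnections() {
        std::vector<SinkLike*> out;
        for (auto& entry : connections_) {
            entry.second->AddRef();
            out.push_back(entry.second);
        }
        return out;
    }

    std::size_t Count() const { return connections_.size(); }

private:
    unsigned long next_cookie_ = 1;
    std::map<unsigned long, SinkLike*> connections_;
};
```

In this model, forgetting to call Release on a pointer returned by EnumConnections leaves the sink's count one too high, which is the leak the paragraph above warns about.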
Going into COM
Let's dig into the COM component. To understand COM, we start with how a COM component is implemented. Open the Unknwn.h file and we can see the following code:
extern "C++"
{
    MIDL_INTERFACE("00000000-0000-0000-C000-000000000046")
    IUnknown
    {
    public:
        BEGIN_INTERFACE
        virtual HRESULT STDMETHODCALLTYPE QueryInterface(
            REFIID riid,
            void __RPC_FAR *__RPC_FAR *ppvObject) = 0;
        virtual ULONG STDMETHODCALLTYPE AddRef( void) = 0;
        virtual ULONG STDMETHODCALLTYPE Release( void) = 0;
        template<class Q>
        HRESULT STDMETHODCALLTYPE QueryInterface(Q** pp)
        {
            return QueryInterface(__uuidof(Q), (void **)pp);
        }
        END_INTERFACE
    };
}
The code says that the IUnknown interface consists of three virtual functions. A virtual function is called through a table that stores the address of the implementing function. To understand this code, you have to know how virtual functions are implemented. Let's look at the following example to study virtual functions:
class CVirtualTest {
public:
    virtual void VFunc1()
    {
    }
    virtual void VFunc2()
    {
    }
};
class CVirtualTest2 : public CVirtualTest{
public:
    void VFunc1()
    {
        DWORD dwEIP = 0;
        // Read the current instruction pointer: call pushes the return address, pop fetches it.
        _asm {
            call GETEIP3
        GETEIP3:
            pop eax
            mov dword ptr[dwEIP], eax
        }
        TRACE("CVirtualTest2::VFunc1 = EIP : %X\n", dwEIP);
    }
    void VFunc2()
    {
        DWORD dwEIP = 0;
        _asm {
            call GETEIP4
        GETEIP4:
            pop eax
            mov dword ptr[dwEIP], eax
        }
        TRACE("CVirtualTest2::VFunc2 = EIP : %X\n", dwEIP);
    }
};
int main()
{
    CVirtualTest2 v;
    TRACE("v is %X\n", &v);
    v.VFunc1();
    v.VFunc2();
}
Let's try debugging. First, set a breakpoint in the main function and start the application. The following values come from TRACE and from the debugger after the creation of v.
We can see the addresses. The address of v is 0x0012F5F8 and that of __vfptr is 0x004154d8. The virtual functions' addresses are 0x00401014 and 0x0040100a. As you can see, the entries of the Virtual Function Table are the addresses of the real code, so if we know the table's address, we can know where the code is. By IUnknown's definition, IUnknown is a struct (class) which has virtual functions. So, if we understand the Virtual Function Table exactly, we can say "We know IUnknown." First, let's look at the memory at the address of v (0x12F5F8).
We can see 0x004154D8 at 0x0012F5F8. Doesn't that look familiar? Right! It is the same as __vfptr in the debugger window. In other words, __vfptr is the first member of the struct when the class only has virtual functions. Let's go to the address stored in __vfptr.
We can see 0x00401014 and 0x0040100A at 0x004154D8. These are the addresses of VFunc1 and VFunc2. At last, we have the virtual functions' addresses. To summarize: go to the class object's address and we get the Virtual Function Table's address; go to the Virtual Function Table's address and we get each function's address.
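The two steps just described can also be tried outside the debugger. The sketch below assumes a mainstream compiler (MSVC, GCC or Clang) that stores the Virtual Function Table pointer as the first pointer-sized field of an object like this one; the C++ standard does not guarantee that layout. Base, Derived, VTableAddress and FirstSlot are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstring>

struct Base {
    virtual int VFunc1() { return 1; }
    virtual int VFunc2() { return 2; }
};

struct Derived : Base {
    int VFunc1() override { return 11; }
};

// Step 1: read the first pointer-sized field of the object.
// Under the layout assumption above, this is the virtual function table's address.
const void* VTableAddress(const void* object) {
    assert(object != nullptr);
    const void* vtable = nullptr;
    std::memcpy(&vtable, object, sizeof(vtable));
    return vtable;
}

// Step 2: read the first slot of that table.
// Under the same assumption, this is the entry for the first virtual function.
const void* FirstSlot(const void* object) {
    const void* slot = nullptr;
    std::memcpy(&slot, VTableAddress(object), sizeof(slot));
    return slot;
}
```

Two objects of the same class should report the same table address, while an object of a derived class that overrides a function should report a different table and a different first slot.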
Integrate Codes
Back to the subject: finding the connections attached to Internet Explorer's Connection Points. This work can be done with IConnectionPoint::EnumConnections. We get an IEnumConnections object from the EnumConnections method and retrieve each connection's information with the IEnumConnections::Next method, which returns CONNECTDATA. CONNECTDATA contains the IUnknown interface pointer, and we can get IDispatch through it. The IDispatch interface has a similar form to IUnknown and inherits from IUnknown.
IEnumConnections* pConn = NULL;
hr = pConnectionPoint->EnumConnections(&pConn);
if(!SUCCEEDED(hr)) {
    return ;
}
CONNECTDATA sConnData;
ULONG uRet = 0;
while(true) {
    uRet = 0;
    memset(&sConnData, 0, sizeof(CONNECTDATA));
    hr = pConn->Next(1, &sConnData, &uRet);
    if(hr != S_OK || uRet != 1)
        break;
    // The object's first pointer-sized field holds the Virtual Function Table's address.
    LPVOID* lpVFT = (LPVOID*)(sConnData.pUnk);
    if(IsBadReadPtr(lpVFT, sizeof(LPVOID)) == FALSE) {
        CString szOutput;
        szOutput.Format("Filename : %s, 0x%x",
            (LPCTSTR)GetProcessFileName( (DWORD)(*lpVFT) ), *lpVFT );
        OutputDebugString(szOutput);
    }
    // Next() added a reference to pUnk, so release it here.
    sConnData.pUnk->Release();
}
// Release the enumerator obtained from EnumConnections.
pConn->Release();
In this code, we cast the IUnknown pointer to LPVOID* and confirm that the address is readable by using IsBadReadPtr. The memory lpVFT points to holds the Virtual Function Table's address, so *lpVFT (that is, lpVFT[0]) is the address of the table itself; the table's first element, in turn, is the first function's address. The table is normally stored inside the DLL that implements the object, so by comparing this address with the ToolHelp API's result, we can see which DLL the object comes from. The GetProcessFileName function, called in the code above, does this work:
CString GetProcessFileName(DWORD dwAddress)
{
    HANDLE hModuleSnap = NULL;
    MODULEENTRY32 me32 = {0};
    DWORD dwBase, dwSize;
    // Take a snapshot of the modules loaded in the current process.
    hModuleSnap = CreateToolhelp32Snapshot(TH32CS_SNAPMODULE, GetCurrentProcessId());
    if (hModuleSnap == INVALID_HANDLE_VALUE)
        return "";
    me32.dwSize = sizeof(MODULEENTRY32);
    if (Module32First(hModuleSnap, &me32)) {
        do {
            dwBase = (DWORD)me32.modBaseAddr;
            dwSize = (DWORD)me32.modBaseSize;
            // The module occupies [dwBase, dwBase + dwSize).
            if((dwBase <= dwAddress) && ((dwBase + dwSize) > dwAddress)) {
                CloseHandle(hModuleSnap);
                return me32.szExePath;
            }
        } while (Module32Next(hModuleSnap, &me32));
    }
    CloseHandle (hModuleSnap);
    return "";
}
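The range test at the heart of GetProcessFileName can be isolated and exercised without the ToolHelp API. The sketch below is portable C++; ModuleRange and FindModule are hypothetical names, and the module in the usage note is made up for illustration. It treats each module as the half-open range [base, base + size), so an address equal to the base counts as inside.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one loaded module.
struct ModuleRange {
    std::string path;
    std::uintptr_t base;
    std::uintptr_t size;
};

// Returns the path of the module whose range contains `address`,
// or an empty string when no module matches.
std::string FindModule(const std::vector<ModuleRange>& modules, std::uintptr_t address) {
    for (const ModuleRange& m : modules) {
        // Written as a subtraction so that base + size cannot overflow.
        if (address >= m.base && address - m.base < m.size) {
            return m.path;
        }
    }
    return std::string();
}
```

For example, with a made-up module at base 0x10000000 and size 0x5000, the addresses 0x10000000 and 0x10004FFF resolve to it, and 0x10005000 does not.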
Points of Interest
The COM component's implementation is interesting. These days, much of Windows is wrapped in COM components, and it is hard to program for Windows without COM. So, it is important to understand how COM components are implemented.
History
- 20th January, 2008: First release