Writing a BHO in Plain C++

cefarix

6 Jun 2009 | CPOL | 17 min read

How to write an Internet Explorer plug-in (Browser Helper Object - BHO) using just C++ and the Windows API; no ATL or MFC involved!

[Figure: The BHO running in Internet Explorer 8]

Introduction
Background
Getting Started
Understanding the COM Code
    COM Terms
    The IUnknown Interface
    COM DLL Exports
    The IClassFactory Interface
Understanding the BHO Code
    How Internet Explorer Loads the BHO
    The IObjectWithSite Interface
    The CEventSink Coclass
Handling Events
    The BeforeNavigate2 Event
Installing the BHO
Using the Code
Conclusion
Revision History

Introduction

Browser Helper Objects (also called BHOs) are COM components that act as Internet Explorer plug-ins. BHOs can be used for customizing Internet Explorer to any extent: from user-interface modifications to web filters to download managers.

In this article, we will learn how to write and install a simple BHO in C++ without the use of extraneous frameworks like ATL or MFC. This example BHO can be used as the starting point for your own BHO.

Background

COM — or Component Object Model — is a language-neutral technology used extensively in Windows to enable the componentization and re-use of software modules.

Most COM code written in C++ (and hence in examples on the Web) uses the Active Template Library (ATL) and/or Microsoft Foundation Classes (MFC) frameworks to help get the job done. However, learning to use ATL and MFC effectively can be yet another barrier to creating COM objects, especially simple ones like BHOs.

This article will teach you what you need to know about COM and BHOs to start writing your own BHOs, and do it using just C++ and the Windows API without being forced into a more complex framework like ATL or MFC.

Getting Started

The best way to understand this article is to download the source code from the link above and follow along. The source code is well-documented, and should be easy to understand.

Understanding the COM Code

COM Terms

First, let's get some COM terminology out of the way:

interface — A set of methods that is visible to other objects. It is the equivalent of public methods in a C++ pure-virtual class.
coclass — Derives from one or more interfaces, and implements all their methods with concrete functionality. It is the equivalent of an instantiable C++ class derived from one or more pure-virtual classes.
object — An instance of a coclass.
GUID — Globally Unique IDentifier — A GUID is a 128-bit unique number. It can be generated via the guidgen.exe utility.
IID — Interface IDentifier — A GUID that identifies an interface.
CLSID — CLaSs IDentifier — A GUID that identifies a COM component. You can find the example BHO's CLSID in the common.h file as CLSID_IEPlugin_Str. Every COM component has a different identifier, and you should generate a new one if you decide to build your own BHO from the example BHO code.

The IUnknown Interface

In order for us to create COM objects, we must write coclasses which implement interfaces. All COM objects must implement an interface known as the IUnknown interface. This interface has three very basic methods which allow other objects to memory-manage objects of a coclass as well as ask objects for other interfaces. These three methods are known as QueryInterface, AddRef, and Release. Since all the various coclasses we'll be implementing derive from IUnknown, it makes sense to create a coclass of IUnknown which implements the three IUnknown methods. We'll call this coclass CUnknown and have all our other coclasses derive from it so we don't have to write the implementation of the IUnknown methods separately for every coclass.

Note: You can find the definition and implementation of CUnknown in the unknown.h file.
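
The idea behind CUnknown can be sketched without any Windows headers. What follows is a minimal, portable model and not the code from unknown.h: IUnknownLike, InterfaceId, UnknownBase, and Greeter are hypothetical stand-ins for IUnknown, an IID, CUnknown, and a derived coclass, and the real methods return HRESULT and ULONG values rather than bool and long.

```cpp
#include <atomic>

// Hypothetical stand-in for an IID; real COM uses 128-bit GUIDs.
enum class InterfaceId { Unknown, Greeter, Other };

// Hypothetical stand-in for IUnknown.
struct IUnknownLike {
    virtual bool QueryInterface(InterfaceId iid, void** out) = 0;
    virtual long AddRef() = 0;
    virtual long Release() = 0;
protected:
    virtual ~IUnknownLike() = default;
};

// Counts live objects so the sketch can show that Release() frees them.
std::atomic<long> g_liveObjects{0};

// Shared implementation of the three methods, in the spirit of CUnknown.
class UnknownBase : public IUnknownLike {
public:
    UnknownBase() { ++g_liveObjects; }
    long AddRef() override { return ++refCount_; }
    long Release() override {
        const long remaining = --refCount_;
        if (remaining == 0) delete this;  // last reference is gone
        return remaining;
    }
    bool QueryInterface(InterfaceId iid, void** out) override {
        if (iid == InterfaceId::Unknown || Supports(iid)) {
            *out = this;  // real code casts to the requested interface type
            AddRef();     // every pointer handed out owns one reference
            return true;
        }
        *out = nullptr;
        return false;
    }
protected:
    ~UnknownBase() override { --g_liveObjects; }
    virtual bool Supports(InterfaceId iid) const = 0;
private:
    std::atomic<long> refCount_{1};  // the creator holds the first reference
};

// A derived coclass only has to say which interfaces it supports.
class Greeter : public UnknownBase {
protected:
    bool Supports(InterfaceId iid) const override {
        return iid == InterfaceId::Greeter;
    }
};
```

The point of the design is that the derived class contains no reference-counting code at all; it inherits the three methods and only answers which interfaces it supports.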

COM DLL Exports

Every COM DLL exports four functions that are used by the COM system to create and manage COM objects from the DLL as well as to install and uninstall the DLL. These functions are:

DllGetClassObject
DllCanUnloadNow
DllRegisterServer
DllUnregisterServer

Note: You can find these functions in the main.cpp file, and they are declared as exports in the dll.def file.

Our DLL must have a coclass of the IClassFactory interface. We'll call this coclass CClassFactory. The DllGetClassObject function creates CClassFactory objects and returns interface pointers to them. The IClassFactory interface is explained in more detail shortly.

The DllCanUnloadNow function is called by COM to determine if our DLL can be unloaded from a process. All we have to do is check whether our DLL is currently managing any object, and return S_OK if we aren't, or S_FALSE if we are. We can do this by incrementing a DLL-global reference counter DllRefCount in the constructors of our coclasses and decrementing the counter in their destructors. If the reference counter is non-zero, it means that instances of our coclasses still exist and the DLL should not be unloaded at the moment.
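
The counting scheme can be modelled in a few lines of portable C++. This is only a sketch of the idea: CountedObject and CanUnloadNow are hypothetical names, and kOk and kFalse stand in for the HRESULT values S_OK (0) and S_FALSE (1) so that the example compiles without the Windows headers.

```cpp
#include <atomic>

// Stand-ins for the HRESULT values S_OK and S_FALSE.
constexpr long kOk = 0;
constexpr long kFalse = 1;

// DLL-wide count of live objects and outstanding locks.
std::atomic<long> DllRefCount{0};

// Any coclass bumps the counter for as long as an instance exists.
class CountedObject {
public:
    CountedObject() { ++DllRefCount; }
    ~CountedObject() { --DllRefCount; }
    CountedObject(const CountedObject&) = delete;
    CountedObject& operator=(const CountedObject&) = delete;
};

// Mirrors the decision DllCanUnloadNow has to make.
long CanUnloadNow() {
    return DllRefCount == 0 ? kOk : kFalse;
}
```

An atomic counter is used here because objects may be created and destroyed on more than one thread; whether the example BHO needs that is an assumption on my part.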

DllRegisterServer is called by a program that wants our DLL to install itself. We have to register our COM component in the system and also as a BHO. We do this by creating the following Registry entries:

HKEY_CLASSES_ROOT\CLSID\<CLSID_IEPlugin_Str> — The default value of this key should be set to a human-readable description of the COM component. In this case, it's "CodeProject Example BHO".
HKEY_CLASSES_ROOT\CLSID\<CLSID_IEPlugin_Str>\InProcServer32 — The existence of this key identifies that this COM component can be loaded as a DLL into the process that wants to use it. There are two values we need to set for this key, as below:

(default) — A REG_SZ or REG_EXPAND_SZ value specifying the path to the DLL which contains the COM component.
ThreadingModel — This specifies the threading model of the COM component. It's a more advanced concept, and we don't need to worry about it. We just set it to "Apartment".

HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Explorer\Browser Helper Objects\<CLSID_IEPlugin_Str> — The existence of this key registers our COM component as a BHO. We create one value under this key named NoExplorer and set it as a REG_DWORD with a value of 1. Normally, BHOs are also loaded by explorer.exe, but this value prevents explorer.exe from unnecessarily loading our BHO.

DllUnregisterServer is called to do the exact opposite of DllRegisterServer — unregister our COM component and also remove it as a BHO. In order to do this, we just need to delete the Registry keys we create in DllRegisterServer.
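
To make the key layout concrete, here is a small sketch that builds the three key paths from a CLSID string. It does not touch the Registry. BhoRegistryKeys and MakeRegistryKeys are hypothetical helpers, and the all-zero CLSID is a placeholder, not the example BHO's real identifier.

```cpp
#include <string>

// Hypothetical placeholder; generate your own CLSID with guidgen.exe.
const std::string kClsidStr = "{00000000-0000-0000-0000-000000000000}";

struct BhoRegistryKeys {
    std::string classKey;   // relative to HKEY_CLASSES_ROOT
    std::string serverKey;  // relative to HKEY_CLASSES_ROOT
    std::string bhoKey;     // relative to HKEY_LOCAL_MACHINE
};

// Builds the key paths that registration creates and unregistration deletes.
BhoRegistryKeys MakeRegistryKeys(const std::string& clsid) {
    BhoRegistryKeys keys;
    keys.classKey = "CLSID\\" + clsid;
    keys.serverKey = keys.classKey + "\\InProcServer32";
    keys.bhoKey = "Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\"
                  "Browser Helper Objects\\" + clsid;
    return keys;
}
```

A real DllRegisterServer would then create these keys and set their values with the Registry API, for example RegCreateKeyEx and RegSetValueEx, and DllUnregisterServer would remove them again, deleting the InProcServer32 subkey before its parent.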

The IClassFactory Interface

COM uses an IClassFactory object created by our DLL's DllGetClassObject function to get instances of other interface-implementations supported by the DLL. The IClassFactory interface defines the methods CreateInstance and LockServer. We call our coclass implementing the IClassFactory interface CClassFactory.

Note: You can find the definition of CClassFactory in ClassFactory.h, and the implementation in ClassFactory.cpp.

CreateInstance does exactly what it says — creates an instance of a coclass in our DLL that supports a given IID. Since we are a BHO, we only need to create instances of a coclass that supports the IObjectWithSite interface.

LockServer is called to either lock or unlock our DLL in memory. Depending on whether it is locking or unlocking, our implementation of LockServer simply increments or decrements the DLL-global DllRefCount variable.
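
The two methods can be sketched as follows. This is a simplified, portable model rather than the code in ClassFactory.cpp: InterfaceId, ObjectWithSiteStub, and ClassFactorySketch are hypothetical stand-ins, and a real CreateInstance reports failure through an HRESULT such as E_NOINTERFACE instead of returning a null pointer.

```cpp
#include <atomic>
#include <memory>

// Hypothetical stand-in for IIDs.
enum class InterfaceId { Unknown, ObjectWithSite, Other };

// Plays the role of the DLL-global counter.
std::atomic<long> DllRefCount{0};

// Stand-in for CObjectWithSite; it only maintains the counter.
struct ObjectWithSiteStub {
    ObjectWithSiteStub() { ++DllRefCount; }
    ~ObjectWithSiteStub() { --DllRefCount; }
};

class ClassFactorySketch {
public:
    // Returns a new object for a supported interface, or nullptr otherwise.
    std::unique_ptr<ObjectWithSiteStub> CreateInstance(InterfaceId iid) {
        if (iid != InterfaceId::Unknown && iid != InterfaceId::ObjectWithSite)
            return nullptr;
        return std::unique_ptr<ObjectWithSiteStub>(new ObjectWithSiteStub());
    }

    // Locking keeps the DLL loaded even while no objects exist.
    void LockServer(bool lock) {
        if (lock) ++DllRefCount; else --DllRefCount;
    }
};
```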

Understanding the BHO Code

How Internet Explorer Loads the BHO

When Internet Explorer loads a BHO, it calls the COM function CoCreateInstance, passing it the CLSID of our BHO and the IID of an interface called IObjectWithSite (explained in more detail below). COM in turn looks up our CLSID in the Registry, loads our DLL into the Internet Explorer process, and calls our exported DllGetClassObject function to get an instance of our CClassFactory coclass. Once COM has a pointer to our CClassFactory object, it calls the object's CreateInstance method, passing it the IID that Internet Explorer supplied. Our implementation of CreateInstance creates an instance of CObjectWithSite, our implementation of IObjectWithSite, gets the interface pointer for the requested IID from it, and gives it back to COM, which passes it on to Internet Explorer. Internet Explorer then uses the IObjectWithSite interface pointer to interact with our BHO.

The IObjectWithSite Interface

BHOs are required to implement the IObjectWithSite interface, which is how Internet Explorer communicates with a BHO. This interface has two methods, SetSite and GetSite. The coclass in our DLL which implements the IObjectWithSite interface is known as CObjectWithSite.

Note: You can find the definition of CObjectWithSite in ObjectWithSite.h, and the implementation in ObjectWithSite.cpp.

SetSite is called by Internet Explorer to give us a pointer to a site object. A site object is merely a COM object which is created by Internet Explorer and can be used by our BHO to communicate with Internet Explorer. Our implementation of SetSite gets an interface pointer to the IConnectionPointContainer interface from the site object. We then use a method FindConnectionPoint in IConnectionPointContainer to get an IConnectionPoint interface pointer to an object that supports the DWebBrowserEvents2 dispatch interface. A dispatch interface is a special type of interface which is derived from IDispatch and is used to receive event notifications through its Invoke method. Our implementation of DWebBrowserEvents2 is known as CEventSink, and is explained in more detail in the next section. We use the Advise method of the IConnectionPoint interface to tell Internet Explorer to pass events on to our CEventSink object.
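
If connection points are new to you, it may help to think of them as a subscribe/notify mechanism. The sketch below is a portable model of that idea only. ConnectionPointSketch is a hypothetical class, sinks are plain callbacks instead of IDispatch pointers, and the real IConnectionPoint::Advise hands the cookie back through an output parameter.

```cpp
#include <functional>
#include <map>
#include <utility>

// Hypothetical stand-in for a connection point: sinks subscribe with
// Advise, receive a cookie, and unsubscribe with Unadvise.
class ConnectionPointSketch {
public:
    using Sink = std::function<void(long eventId)>;

    // Registers a sink and returns the cookie that identifies it.
    unsigned long Advise(Sink sink) {
        const unsigned long cookie = nextCookie_++;
        sinks_[cookie] = std::move(sink);
        return cookie;
    }

    // Removes a sink; returns false if the cookie is not registered.
    bool Unadvise(unsigned long cookie) { return sinks_.erase(cookie) == 1; }

    // The event source notifies every connected sink.
    void Fire(long eventId) const {
        for (const auto& entry : sinks_) entry.second(eventId);
    }

private:
    unsigned long nextCookie_ = 1;
    std::map<unsigned long, Sink> sinks_;
};
```

In the BHO, Internet Explorer plays the role of the event source and CEventSink plays the role of the sink. The cookie returned by Advise is what a BHO would later pass to Unadvise to stop receiving events.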

Our implementation of SetSite also gets an interface pointer to the IWebBrowser2 interface from the site object. The IWebBrowser2 interface is implemented by Internet Explorer's site object, and has methods which can be used by us to interact with Internet Explorer.

Note: Since this example BHO only receives event notifications from Internet Explorer, and does not actually control Internet Explorer, we don't make use of any of the methods in IWebBrowser2. However, I have included the code to get the IWebBrowser2 interface so that you can use it in your own BHO if needed. You can find the documentation for IWebBrowser2 here.

GetSite is called by Internet Explorer to know which object we have currently set as the site object. Internet Explorer passes us an IID for an interface it wants from our currently set site object, and we simply get that interface from our site object and give it back to Internet Explorer.

The CEventSink Coclass

The CEventSink coclass is our implementation of the DWebBrowserEvents2 dispinterface. DWebBrowserEvents2 derives from IDispatch but does not declare any methods of its own. Instead, DWebBrowserEvents2 exists only to carry a unique DIID (dispatch IID). This DIID identifies the set of events that a coclass of DWebBrowserEvents2 can receive through its implementation of the IDispatch method Invoke.

Note: You can find the definition of CEventSink in EventSink.h, and the implementation in EventSink.cpp.

When Internet Explorer wants to notify us of an event, it will call the Invoke method of CEventSink, passing the event ID in the dispIdMember parameter and other information about the event in the pDispParams parameter. IDispatch also has three other methods besides Invoke (GetTypeInfoCount, GetTypeInfo, and GetIDsOfNames), but we don't need to implement any functionality for them because we only receive events.

Another difference that you might notice about CEventSink from our other coclasses is that it does not derive from CUnknown. This is because we only need one DLL-global statically-allocated instance of CEventSink, called EventSink. Because of this, we don't need to implement any reference counting, and hence don't need the reference counting and memory-management functionality of CUnknown.

Handling Events

Internet Explorer calls CEventSink::Invoke to notify us of events. The dispIdMember parameter contains an ID identifying which event is being fired, and the pDispParams->rgvarg[] array contains the arguments of the event as an array of VARIANTs. You can see what arguments an event takes by looking it up in the documentation of DWebBrowserEvents2, which can be found here. The arguments are passed in the pDispParams->rgvarg[] array in the reverse of the order in which they are listed in the event's documentation, so the last documented argument is rgvarg[0]. To convert these arguments from a VARIANT to a more usable type, we first declare a VARIANT for every argument we will use and initialize it using the VariantInit API function. Then, we use the VariantChangeType API function to convert each VARIANT in the pDispParams->rgvarg[] array to a VARIANT of a more usable type. Once we have done this for all the arguments we need, we pass their values to our own Event_* method through the appropriate member of each converted VARIANT. After the event handler method returns, we free any resources used by our converted VARIANTs by calling the VariantClear function on each of them. A concrete example of how to do this is given below.
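
The reversed ordering is easy to get wrong, so here is a tiny helper that makes the arithmetic explicit. RgvargIndex is a hypothetical function, not part of the example BHO. In real code, the argument count would come from pDispParams->cArgs.

```cpp
#include <cassert>
#include <cstddef>

// Maps the zero-based position of an argument in the documentation to its
// index in rgvarg[], where the arguments are stored last-to-first.
std::size_t RgvargIndex(std::size_t documentedPosition, std::size_t argCount) {
    assert(documentedPosition < argCount);
    return argCount - 1 - documentedPosition;
}
```

For an event with seven arguments, the first documented argument lands at index 6 and the last one at index 0.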

The BeforeNavigate2 Event

Our example BHO just handles one event, the BeforeNavigate2 event. The documentation for BeforeNavigate2 can be found here. This event is fired just before Internet Explorer navigates to a new location.

Taking a look at the documentation, we see that BeforeNavigate2 gives us seven arguments. We will not concern ourselves with the pDisp argument, which is just an IDispatch interface pointer to the site object.

The event arguments are stored in the pDispParams->rgvarg[] as variants. Before we can use these arguments, though, we need to convert the variants to different variant types which are easier to use. In order to do this, we have an array of variants in the Invoke method which stores the converted variants. We first need to initialize each of these variants using the VariantInit function, and when we are done, we free any used resources by using the VariantClear function.

The variant array in the Invoke method

C++

VARIANT v[5];
// Used to hold converted event parameters before
// passing them onto the event handling method

...
for(n=0;n<5;n++) VariantInit(&v[n]); // initialize the variant array
... // use the variant array here
for(n=0;n<5;n++) VariantClear(&v[n]); // free the variant array

The url argument contains the URL that is being navigated to. It is the fifth argument from last (the last one being the 0th), so we access it as pDispParams->rgvarg[5]. We convert this variant to a variant type of VT_BSTR as it may not be in that format, and store the converted variant into v[0]. We can then access the URL as a BSTR string by using v[0].bstrVal. A BSTR string is commonly used in COM to pass string data. It consists of a 4-byte prefix indicating the length of the string, followed by the string data as a double-byte character NULL-terminated Unicode string. A BSTR variable always points to the string data, and not to the 4-byte prefix before it, which conveniently allows us to use it as a C-style string. A double-byte character string type is declared in the Windows API headers as LPOLESTR, regardless of whether the program is using the wide-character set by default. So, we can pass the string v[0].bstrVal as an LPOLESTR parameter to the Event_BeforeNavigate2 event-handling method.

How to access the url argument

C++

VariantChangeType(&v[0],&pDispParams->rgvarg[5],0,VT_BSTR);
// make sure the argument is of variant type VT_BSTR

...
Event_BeforeNavigate2( (LPOLESTR) v[0].bstrVal , ... );
// pass the url argument to the event handler
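
The BSTR memory layout can be illustrated with a portable model. This sketch does not use the real BSTR allocator (SysAllocString and friends); MakeBstrLikeBuffer, StringData, and ByteLength are hypothetical helpers that only mimic the layout. Note that the length prefix counts bytes, not characters, and excludes the terminator.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Builds a buffer laid out like a BSTR: a 4-byte length prefix (in bytes,
// not counting the terminator), the UTF-16 characters, then a NUL.
std::vector<unsigned char> MakeBstrLikeBuffer(const std::u16string& text) {
    const std::uint32_t byteLength =
        static_cast<std::uint32_t>(text.size() * sizeof(char16_t));
    std::vector<unsigned char> buffer(4 + byteLength + sizeof(char16_t), 0);
    std::memcpy(buffer.data(), &byteLength, 4);
    std::memcpy(buffer.data() + 4, text.data(), byteLength);
    return buffer;
}

// A BSTR variable points here: just past the prefix, at the first character.
const unsigned char* StringData(const std::vector<unsigned char>& buffer) {
    return buffer.data() + 4;
}

// Reads the byte length back from the prefix that precedes the characters.
std::uint32_t ByteLength(const unsigned char* stringData) {
    std::uint32_t length = 0;
    std::memcpy(&length, stringData - 4, 4);
    return length;
}
```

Because the pointer skips the prefix and the data ends in a NUL, code that expects a C-style wide string can read it directly, while code that knows it is a BSTR can still find the length in front of it.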

The Flags argument contains information about whether the navigation is a result of an external window or tab. It is the fourth argument from last, so we access it as pDispParams->rgvarg[4]. We convert this variant to a variant type of VT_I4, and store the converted variant into v[1]. VT_I4 means a 4-byte signed integer, so we can then access the value as a long through v[1].lVal, and pass it on to Event_BeforeNavigate2.

How to access the Flags argument

C++

VariantChangeType(&v[1],&pDispParams->rgvarg[4],0,VT_I4);
// make sure the argument is of variant type VT_I4 (a long)

...
Event_BeforeNavigate2( ... , v[1].lVal , ... );
// pass the Flags argument to the event handler

The TargetFrameName argument contains the name of the frame in which the navigation is happening. It is the third argument from last, so we access it as pDispParams->rgvarg[3]. We convert this variant to a variant type of VT_BSTR and store the converted variant into v[2]. We can then access the string value as a LPOLESTR through v[2].bstrVal, just like with the url argument, and similarly pass it on to Event_BeforeNavigate2.

How to access the TargetFrameName argument

C++

VariantChangeType(&v[2],&pDispParams->rgvarg[3],0,VT_BSTR);
// make sure the argument is of variant type VT_BSTR
...
Event_BeforeNavigate2( ... , (LPOLESTR) v[2].bstrVal , ... );
// pass the TargetFrameName argument to the event handler

The PostData argument contains POST data if the navigation is due to an HTTP POST request. It is the second argument from last, so we access it as pDispParams->rgvarg[2]. The documentation states that PostData is of the variant type VT_BYREF|VT_VARIANT. This means that PostData is actually a pointer to another variant. Reading further in the Remarks section of the documentation, we can see that the variant that is pointed to contains a SAFEARRAY. A SAFEARRAY is often used in COM to contain array data. We convert the PostData variant to a variant of type VT_UI1|VT_ARRAY and store the converted variant into v[3]. VT_UI1|VT_ARRAY means that after the conversion, v[3] is a variant that points to a SAFEARRAY which has a 1-dimensional array of 1-byte unsigned integers. Before accessing the SAFEARRAY data in v[3], we first need to check if the conversion to VT_UI1|VT_ARRAY was successful. If the navigation did not contain any POST data, then the conversion would not have been successful and the variant type of v[3] would be VT_EMPTY. On the other hand, if the data exists, we can access the SAFEARRAY the variant points to by using v[3].parray, and we can access the data within that SAFEARRAY using the SafeArray*() API functions.

First, we get the size of the data in the array using the functions SafeArrayGetLBound and SafeArrayGetUBound. These functions retrieve the lower and upper bounds of the array, respectively. Subtracting the lower bound from the upper bound and adding 1 gives us the number of elements in the array. We then access the actual data by using the function SafeArrayAccessData, which gives us a pointer to the data and also locks the array. Since the array's elements are of type 1-byte unsigned integer, we can access the data as a C-style array of unsigned chars. We pass the data pointer and data size to Event_BeforeNavigate2. Afterwards, we unlock the array by calling SafeArrayUnaccessData.

How to access the PostData argument

C++

PVOID pv;
LONG lbound,ubound,sz;
...
VariantChangeType(&v[3],&pDispParams->rgvarg[2],0,VT_UI1|VT_ARRAY);
// make sure the argument is a variant containing
// a SAFEARRAY of 1-byte unsigned integers

if(v[3].vt!=VT_EMPTY) {
  // If the conversion was successful, we have POST data
  SafeArrayGetLBound(v[3].parray,1,&lbound); // get the lower bound (first element index); dimensions are 1-based
  SafeArrayGetUBound(v[3].parray,1,&ubound); // get the upper bound (last element index)
  sz=ubound-lbound+1; // use the bounds to calculate the data size
  SafeArrayAccessData(v[3].parray,&pv); // get access to the data
} else {
  // If the conversion was not successful, we do not have any POST data
  sz=0; // set data size to zero
  pv=NULL; // set the data pointer to NULL
}
...
Event_BeforeNavigate2( ... , (PUCHAR) pv , sz , ... );
// pass the pointer to the data and the data size
// of the PostData argument to the event handler
...
if(v[3].vt!=VT_EMPTY) SafeArrayUnaccessData(v[3].parray);
// if we had previously accessed the data in the SAFEARRAY, unaccess it now
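
The size calculation is simple, but off-by-one mistakes are common, so it is worth isolating. ElementCount is a hypothetical helper that works on plain integers. Treating an upper bound below the lower bound as an empty dimension is a defensive assumption of this sketch.

```cpp
#include <cstddef>

// Number of elements in one dimension of an array whose valid indices run
// from lowerBound to upperBound inclusive. If the upper bound is below the
// lower bound, the dimension is treated as empty.
std::size_t ElementCount(long lowerBound, long upperBound) {
    if (upperBound < lowerBound) return 0;
    return static_cast<std::size_t>(upperBound - lowerBound) + 1;
}
```

With bounds 0 and 9 the function reports 10 elements, which for an array of 1-byte unsigned integers is also the size of the POST data in bytes.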

The Headers argument contains any additional HTTP headers that were sent for the navigation. It is the next-to-last argument, so we access it as pDispParams->rgvarg[1]. We convert this variant to a variant of type VT_BSTR and store the converted variant into v[4]. We pass the data on to Event_BeforeNavigate2 in the same fashion as the url and TargetFrameName arguments.

How to access the Headers argument

C++

VariantChangeType(&v[4],&pDispParams->rgvarg[1],0,VT_BSTR);
// make sure the argument is of variant type VT_BSTR
...
Event_BeforeNavigate2( ... , (LPOLESTR) v[4].bstrVal , ... );
// pass the Headers argument to the event handler

The Cancel argument is a return value that indicates to Internet Explorer whether it should continue or cancel the navigation. It is a variant of type VT_BYREF|VT_BOOL, which means it contains a pointer to a VARIANT_BOOL, which takes a value of either VARIANT_TRUE or VARIANT_FALSE. It is the last argument, so we access it as pDispParams->rgvarg[0]. We can access the VARIANT_BOOL that this argument points to via *(pDispParams->rgvarg[0].pboolVal). If we set the VARIANT_BOOL pointed to by the Cancel argument to VARIANT_TRUE, Internet Explorer cancels the navigation. If we set it to VARIANT_FALSE, Internet Explorer continues the navigation as normal. Because more than one BHO can handle the BeforeNavigate2 event, the pre-existing value of the VARIANT_BOOL pointed to by the Cancel argument may have been set by a BHO that handled BeforeNavigate2 before ours. We pass the existing value to Event_BeforeNavigate2 as a bool, so Event_BeforeNavigate2 can decide whether to override the existing value or keep it. The return value of Event_BeforeNavigate2 is a bool that indicates the new value of the Cancel argument.

How to access the Cancel argument

C++

bool b;
...

b = Event_BeforeNavigate2( ... ,
    ( (*(pDispParams->rgvarg[0].pboolVal)) != VARIANT_FALSE ) );
// pass the pre-existing value of the Cancel argument
// to the event handler, and get the new value for it
...
// Set the new value of the Cancel argument based upon
// the return value of Event_BeforeNavigate2()
if(b) *(pDispParams->rgvarg[0].pboolVal)=VARIANT_TRUE;
else *(pDispParams->rgvarg[0].pboolVal)=VARIANT_FALSE;
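
The read, decide, write-back pattern can be tried out in isolation with stand-in types. In this sketch, VariantBool, kVariantTrue, and kVariantFalse mirror the Windows definitions (VARIANT_BOOL is a 16-bit integer, VARIANT_TRUE is -1, and VARIANT_FALSE is 0). OnBeforeNavigate and ApplyCancel are hypothetical functions, and blocking URLs that contain "blocked" is just an illustrative policy.

```cpp
#include <string>

// Stand-ins matching the Windows definitions of VARIANT_BOOL,
// VARIANT_TRUE, and VARIANT_FALSE.
using VariantBool = short;
constexpr VariantBool kVariantTrue = -1;
constexpr VariantBool kVariantFalse = 0;

// Hypothetical handler: cancels navigation to URLs containing "blocked",
// and otherwise keeps whatever an earlier handler decided.
bool OnBeforeNavigate(const std::wstring& url, bool cancelSoFar) {
    if (url.find(L"blocked") != std::wstring::npos) return true;
    return cancelSoFar;
}

// Reads the incoming value, calls the handler, and writes the result back
// through the pointer, as Invoke does with rgvarg[0].pboolVal.
void ApplyCancel(const std::wstring& url, VariantBool* cancel) {
    const bool before = (*cancel != kVariantFalse);
    const bool after = OnBeforeNavigate(url, before);
    *cancel = after ? kVariantTrue : kVariantFalse;
}
```

Comparing against the false value rather than the true value is deliberate: any non-zero value then counts as true, which is the safer reading of a value written by someone else's code.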

Note: You can find the code for the BeforeNavigate2 event handler, Event_BeforeNavigate2, in the EventSink.cpp file.

Installing the BHO

Installing the BHO merely requires that a process calls our BHO's DllRegisterServer function. This is made simple via the regsvr32.exe utility. Simply run the command regsvr32.exe <path to the BHO dll>, and the BHO should be registered. To uninstall the BHO, a process needs to call our BHO's DllUnregisterServer function. This can also be accomplished via regsvr32.exe by running the command regsvr32.exe /u <path to the BHO dll>.

Using the Code

Use the source code for this example BHO as a starting point for your own BHO.

Note: Don't forget to generate a new CLSID_IEPlugin and CLSID_IEPlugin_Str for your own BHO! You can generate a new CLSID by using the guidgen.exe utility. You can find CLSID_IEPlugin defined in main.cpp and CLSID_IEPlugin_Str in common.h.

You can start by customizing the Event_BeforeNavigate2 event handler for your own needs. You can also handle more events by adding new event handler methods to the CEventSink class and calling them from CEventSink's Invoke method.

Note: You can find the definition of CEventSink in the EventSink.h file, and the implementation of CEventSink and Event_BeforeNavigate2 in the EventSink.cpp file.
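
As a rough idea of what handling more events looks like, the sketch below models the dispatch step as a switch on the event ID. It is portable and does not call any real handlers. HandlerFor, Event_NavigateComplete2, and Event_DocumentComplete are hypothetical names, and the numeric IDs are what I believe exdispid.h defines for these events, so check them against your SDK headers and use the named constants in real code.

```cpp
#include <string>

// Stand-ins for DISPID_BEFORENAVIGATE2, DISPID_NAVIGATECOMPLETE2, and
// DISPID_DOCUMENTCOMPLETE; verify the values against exdispid.h.
constexpr long kDispIdBeforeNavigate2 = 250;
constexpr long kDispIdNavigateComplete2 = 252;
constexpr long kDispIdDocumentComplete = 259;

// Hypothetical dispatcher: returns the name of the handler that Invoke
// would call for a given event ID, or an empty string for ignored events.
std::string HandlerFor(long dispIdMember) {
    switch (dispIdMember) {
        case kDispIdBeforeNavigate2:   return "Event_BeforeNavigate2";
        case kDispIdNavigateComplete2: return "Event_NavigateComplete2";
        case kDispIdDocumentComplete:  return "Event_DocumentComplete";
        default:                       return "";
    }
}
```

In the actual Invoke method, each case would convert the arguments it needs from pDispParams->rgvarg[] and call the corresponding Event_* method. Events that are not handled can simply be ignored.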

Conclusion

I hope you have enjoyed reading this article and learned some new stuff. Feel free to leave comments with questions, corrections, and suggestions!

Revision History

2009-06-06:

Initial posting.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)