Introduction
SimpleBrowser is intended to make it easier to use the WebBrowser control in MFC applications. If you've ever tried this before, you know that doing simple things can be overly complicated. I wanted to do the following with the control:
- Generate HTML in my code, and pass it to the control as a string.
- Navigate to HTML documents that are resources in my program.
- Catch events from the control.
- Print the control's content.
The WebBrowser control exposes the IWebBrowser2 interface, which lets you control the browser. It signals events using the DWebBrowserEvents2 interface, which lets you know what the browser is doing. Unfortunately, neither of these gives you a clear-cut and simple way to do the things I've listed.
The SimpleBrowser class
SimpleBrowser wraps the IWebBrowser2 interface with MFC-friendly methods. The methods convert arguments into the form required by the interface. Some of the IWebBrowser2 methods require BSTRs, others VARIANTs, and there's even the occasional SAFEARRAY, all of which can be awkward (or at least not intuitive) for casual MFC use.
How do you 'navigate' the WebBrowser to a document in memory? An obvious solution is to write the document to a temporary file, and then navigate the control to the file via a URL of the form file://c:\dir\filename.ext. This seems slow and cumbersome. You also need to delete the temporary file when you're done. The IWebBrowser2 interface lets you retrieve an interface pointer to the current HTML document (IHTMLDocument2), which then lets you manipulate the document directly without the need for temporary files. The problem is, the interface isn't available until the browser has completed navigating to a document. The interface does include a write() method, so if we can get a document, we can use this to write our data to the browser.
The WebBrowser supports the res: protocol, which lets you use program resources in your HTML pages using the URL syntax res://EXE/type/resource. EXE is the path to your executable. Type is either the string name of the resource type, or a string of the form #nnn, where nnn is the numerical value of one of the predefined resource types (RT_BITMAP, and so on). Resource is the string name of the resource, or a string of the form #nnn, where nnn is the resource ID. As long as you use the string form of the res: protocol, you can navigate to a URL directly. This requires that you use the following type of statement in your .RC file:
"MyPage" "HTML" "MyPage.html"
You could then navigate to this page using the URL res://MyProgram.Exe/HTML/MyPage.
If you want to use the built-in RT_HTML resource type and integer resource identifiers,
IDR_MY_PAGE HTML "MyPage.html"
the URL becomes a problem. The WebBrowser control doesn't seem to like URLs with "#" characters, and using the replacement character "%23" is messy (since it's followed by a numeric value). The WebBrowser also does not seem to handle ANSI/UNICODE issues very well in either case.
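To make the two URL forms concrete, here is a minimal sketch of how the res: URLs described above are put together. The helper names (buildResUrl, resComponent) and the resource ID 101 are made up for illustration; they are not part of SimpleBrowser. The value 23 is the numeric value of the predefined RT_HTML resource type.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: formats a numeric resource type or ID for a res: URL.
// The plain form is "#nnn"; with 'escape' set, '#' becomes "%23".
std::string resComponent(unsigned id, bool escape)
{
    return std::string(escape ? "%23" : "#") + std::to_string(id);
}

// Hypothetical helper: assembles res://EXE/type/resource from its parts.
std::string buildResUrl(const std::string& module,
                        const std::string& type,
                        const std::string& resource)
{
    assert(!module.empty());  // the executable path is required
    return "res://" + module + "/" + type + "/" + resource;
}
```

With string names, buildResUrl("MyProgram.Exe", "HTML", "MyPage") yields res://MyProgram.Exe/HTML/MyPage. With numeric identifiers and escaping, the same call produces res://MyProgram.Exe/%2323/%23101, which shows why the "%23" form is hard to read: the escape runs straight into the number that follows it.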
The WebBrowser lets you print the current document using the ExecWB() method in the IWebBrowser2 interface. Unfortunately, ExecWB() is something of a catch-all, and requires several arcane arguments. It turns out, there is a way to specify header and footer formats for printing, but the mechanism is awkward. SimpleBrowser wraps this up in a single method.
SimpleBrowser exposes the DWebBrowserEvents2 events via virtual functions that you can override in a derived class. Event data is converted into MFC-friendly forms before being passed to the virtual functions. The base class versions of these functions send notifications to the parent window via the standard WM_NOTIFY mechanism.
Create(DWORD dwStyle, const RECT& rect, CWnd* pParentWnd,UINT nID)
Create is the standard window creation function. The noteworthy thing about the Create(...) function is what happens after it creates the browser. SimpleBrowser navigates to the predefined document about:blank. When that navigation completes, any text passed to the SimpleBrowser using the Write() function (see below) is then written to the browser.
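The deferral described above can be sketched in portable C++. This is not the SimpleBrowser source: DeferredWriter, its sink callback, and onDocumentReady() are hypothetical names standing in for the control, the real document write, and the "navigation to about:blank completed" event.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical sketch: buffers text until the document is available,
// then forwards it (and everything written afterwards) to a sink.
class DeferredWriter {
public:
    explicit DeferredWriter(std::function<void(const std::string&)> sink)
        : sink_(std::move(sink))
    {
        assert(sink_ != nullptr);  // a sink is required
    }

    // Writes immediately if the document is ready, otherwise saves the text.
    void write(const std::string& text)
    {
        if (ready_) {
            sink_(text);
        } else {
            pending_ += text;
        }
    }

    // Call when the initial navigation has completed; flushes saved text.
    void onDocumentReady()
    {
        ready_ = true;
        if (!pending_.empty()) {
            sink_(pending_);
            pending_.clear();
        }
    }

private:
    std::function<void(const std::string&)> sink_;
    std::string pending_;
    bool ready_ = false;
};
```

The point of the design is that callers never have to wait: text written too early is kept and delivered in order once the document exists.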
CreateFromControl(CWnd *pParentWnd,UINT nID)
CreateFromControl() 'creates' the SimpleBrowser in a dialog by replacing another control. You can lay out your dialog, placing a static control where you want the browser window. pParentWnd in this case is a dialog box (this, for example), while nID identifies the static control.
CreateFromControl() gets the location of the static control, destroys the static (since it won't be used), and then creates the browser in its place using Create(...). The browser uses the ID originally given to the static control (nID). You don't have to use a static control; any control type will do. I use a static control with the 'static edge' style set (that makes it easier to see the extent of the control).
IHTMLDocument2 *GetDocument()
GetDocument() returns an interface pointer to the IHTMLDocument2 interface for the current HTML document loaded in the browser control. This interface pointer can be used to manipulate the document directly, in case you've got something special in mind that's not supported by the direct methods supplied by SimpleBrowser. GetDocument() returns NULL in case you've navigated the control to something other than an HTML document. For example, the WebBrowser control is perfectly happy to let you navigate to a Microsoft Word document, which will be hosted by the control.
Write(LPCTSTR string)
Write(...) lets you create an HTML document in a string, and display it in a browser window in your application. This is useful for creating displays or reports that don't lend themselves to a fixed set of Windows controls, or information that needs special formatting. HTML is easy to generate, and you get printing 'for free' (see the Print() and PrintPreview() methods below).
SimpleBrowser writes the string to the WebBrowser control using something like the following:
IHTMLDocument2 *document = GetDocument();
if (document != NULL) {
    // write() takes a SAFEARRAY of VARIANTs; build a one-element array
    SAFEARRAY *safe_array = SafeArrayCreateVector(VT_VARIANT,0,1);
    VARIANT *variant;
    SafeArrayAccessData(safe_array,(LPVOID *)&variant);
    variant->vt = VT_BSTR;
    variant->bstrVal = CString(string).AllocSysString();
    SafeArrayUnaccessData(safe_array);
    document->write(safe_array);
    // the caller still owns the array; destroying it also frees the BSTR
    SafeArrayDestroy(safe_array);
    document->Release();
    document = NULL;
}
The Write(...) method appends the string to the current document. One nicety is that the WebBrowser control is tolerant of HTML documents that are not 'well-formed':
| <html><body>.... | No trailing </body> or </html> tags. |
| <html><body>...</body></html> <html><body>...</body></html> | Multiple complete documents. |
| Simple text with <b>tags</b>... | No <html>...</html> or <body>...</body> tags at all. |
This lets you construct your document using several Write(...) calls. You can also update the browser contents as needed, without having to rebuild the whole document every time.
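When a document is built up piece by piece like this, any data values placed into the markup need to be escaped so they are not read as tags. Below is a minimal sketch of that step; escapeHtml and tableRow are hypothetical helpers, not part of SimpleBrowser, and they use std::string rather than CString so the example stays self-contained.

```cpp
#include <string>

// Hypothetical helper: escapes the characters that are special in HTML text.
std::string escapeHtml(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
        case '&': out += "&amp;";  break;
        case '<': out += "&lt;";   break;
        case '>': out += "&gt;";   break;
        case '"': out += "&quot;"; break;
        default:  out += c;        break;
        }
    }
    return out;
}

// Hypothetical helper: one report row, suitable for a single Write(...) call.
std::string tableRow(const std::string& name, const std::string& value)
{
    return "<tr><td>" + escapeHtml(name) + "</td><td>" +
           escapeHtml(value) + "</td></tr>";
}
```

A report could then be produced by writing an opening table tag, one tableRow(...) result per record, and a closing tag, each in its own Write(...) call.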
Clear()
Clear() deletes any existing content in the WebBrowser control. If you've got an HTML document in the control, Clear() empties the display by closing and re-opening the current document, and then refreshing the display. This appears to be faster than navigating to about:blank, which is used when you don't have an HTML document in the control.
NavigateResource(int resource_ID)
The res: protocol lets you use resources in your executable, in your HTML pages. As I mentioned earlier, the res: protocol isn't terribly friendly for URLs that you pass to the WebBrowser, especially when using numeric resource IDs. My solution is to load the HTML resource into a string and use the Write(...) approach.
NavigateResource() expects the resource to be defined in the .RC file as follows:
IDR_MY_PAGE HTML "MyPage.html"
HTML is the resource type used when you insert an HTML resource using the IDE.
NavigateResource() gets interesting when it comes to loading UNICODE HTML resources in an application compiled for MBCS (ANSI), or vice versa. My method is to convert the document to match the application. This works well until you have a UNICODE resource containing characters not in the ANSI character set (Japanese Kanji, for example) and an MBCS application. In this case, the conversion does not work. This shouldn't hurt too much, because you probably wouldn't be trying to display a Far Eastern HTML document in an ANSI application anyway.
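The failure case is easy to see in a simplified model. The sketch below is not the conversion SimpleBrowser performs (on Windows that would typically go through the system code page, for example with WideCharToMultiByte); it assumes Latin-1 as a stand-in for the single-byte character set, and narrowToLatin1 is a hypothetical name.

```cpp
#include <string>

// Simplified stand-in for a UNICODE-to-single-byte conversion.
// Assumption: the target character set is Latin-1, so only code units
// up to 0xFF can be represented. Returns false on the first character
// that does not fit; 'narrow' then holds only the part converted so far.
bool narrowToLatin1(const std::u16string& wide, std::string& narrow)
{
    narrow.clear();
    for (char16_t unit : wide) {
        if (unit > 0xFF) {
            return false;  // e.g. Kanji: no single-byte equivalent
        }
        narrow += static_cast<char>(unit);
    }
    return true;
}
```

Markup containing only Western European text converts cleanly, while a document containing Kanji fails on the first such character, which is the situation described above.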
Print(LPCTSTR header,LPCTSTR footer)
The IWebBrowser2 interface lets you print the current contents of the WebBrowser control using the ExecWB method. The Print(...) function I've supplied eliminates the cryptic arguments to ExecWB(), and adds a simple way to specify the header and footer for the printed page:
HRESULT hr;
VARIANT header_variant;
VariantInit(&header_variant);
V_VT(&header_variant) = VT_BSTR;
V_BSTR(&header_variant) =
    CString(header).AllocSysString();
VARIANT footer_variant;
VariantInit(&footer_variant);
V_VT(&footer_variant) = VT_BSTR;
V_BSTR(&footer_variant) =
    CString(footer).AllocSysString();
long index;
SAFEARRAYBOUND parameter_array_bound[1];
SAFEARRAY *parameter_array = NULL;
parameter_array_bound[0].cElements = 2;
parameter_array_bound[0].lLbound = 0;
parameter_array = SafeArrayCreate(VT_VARIANT,1,
                                  parameter_array_bound);
index = 0;
hr = SafeArrayPutElement(parameter_array,
                         &index,
                         &header_variant);
index = 1;
hr = SafeArrayPutElement(parameter_array,
                         &index,
                         &footer_variant);
VARIANT parameter;
VariantInit(&parameter);
V_VT(&parameter) = VT_ARRAY | VT_BYREF;
V_ARRAY(&parameter) = parameter_array;
hr = _Browser->ExecWB(OLECMDID_PRINT,
                      OLECMDEXECOPT_DODEFAULT,
                      &parameter,
                      NULL);
if (!SUCCEEDED(hr)) {
    VariantClear(&header_variant);
    VariantClear(&footer_variant);
    if (parameter_array != NULL) {
        SafeArrayDestroy(parameter_array);
    }
}
There is one caveat with using the Print(...) function, caused by the way the WebBrowser control handles printing. The ExecWB() method passes a copy of the document to a separate thread, which then performs the actual printing. The ExecWB() method returns immediately, without waiting for the thread to finish. For this reason, there is no simple way to determine that printing has completed. In fact, if the browser is destroyed while printing is still in progress, only part of the contents will be printed. The WebBrowser control does issue the print template teardown event (see below), which is issued when printing has completed.
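One way to act on that event is to keep a small counter and refuse to destroy the browser while a print job is outstanding. The sketch below is an illustration only: PrintTracker is a hypothetical class, and it assumes each "print template instantiation" event is eventually matched by one "print template teardown" event.

```cpp
// Hypothetical helper: tracks whether any print job is still running.
// Assumption: instantiation and teardown events arrive in matched pairs.
class PrintTracker {
public:
    // Call from the print template instantiation handler.
    void onPrintTemplateInstantiation() { ++active_; }

    // Call from the print template teardown handler.
    void onPrintTemplateTeardown()
    {
        if (active_ > 0) {
            --active_;
        }
    }

    // True while at least one print job has started but not finished.
    bool printingInProgress() const { return active_ > 0; }

private:
    int active_ = 0;
};
```

A dialog could consult printingInProgress() before closing and postpone destruction of the browser until the teardown handler has run.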
PrintPreview()
Displays the Print Preview for the current content loaded in the control.
Handling events
SimpleBrowser is actually a basic CWnd that is a container for the actual WebBrowser control. Using MFC, the WebBrowser control signals events to its container via an 'event sink map'. The events that may be 'sunk' are described by the DWebBrowserEvents2 interface. SimpleBrowser forwards these events to the outside world by converting event information into MFC-friendly forms (CStrings and so on) and calling a virtual function. The base class implementations of these functions send the events to the parent window using the WM_NOTIFY mechanism.
SimpleBrowser supports these events:
| Event (virtual function) | NotificationType | Description |
| --- | --- | --- |
| OnBeforeNavigate2(CString URL, CString frame, void *post_data, int post_data_size, CString headers) | BeforeNavigate2 | Called before navigation begins; URL is the destination, frame is the frame name ("" if none). The post_data value will be NULL if there is no POST data. The headers value contains the headers to be sent to the server. Return true to cancel the navigation, false to continue. |
| OnDocumentComplete(CString URL) | DocumentComplete | Navigation to the document has completed; URL is the location. |
| OnDownloadBegin() | DownloadBegin | Signals the beginning of a navigation operation. |
| OnProgressChange(int progress, int progress_max) | ProgressChange | Navigation progress update. I've seen the WebBrowser signal ProgressChange events where progress > progress_max, so keep that in mind. |
| OnDownloadComplete() | DownloadComplete | Navigation operation completed. |
| OnNavigateComplete2(CString URL) | NavigateComplete2 | Navigation to a hyperlink has completed. URL is the location (URL = "about:blank" if Write() or NavigateResource() are used). |
| OnStatusTextChange(CString text) | StatusTextChange | Status text has changed. |
| OnTitleChange(CString text) | TitleChange | Title text has changed. |
| OnPrintTemplateInstantiation() | PrintTemplateInstantiation | Printing has begun. |
| OnPrintTemplateTeardown() | PrintTemplateTeardown | Printing has completed. |
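Because ProgressChange can report a progress value larger than progress_max, a handler that drives a progress bar should clamp the value first. The function below is a minimal sketch of that; progressPercent is a hypothetical helper, not something SimpleBrowser provides.

```cpp
#include <algorithm>

// Hypothetical helper: converts a ProgressChange pair into 0..100.
// Guards against progress > progress_max and against a zero or
// negative maximum, both of which would otherwise give nonsense.
int progressPercent(int progress, int progress_max)
{
    if (progress_max <= 0 || progress <= 0) {
        return 0;
    }
    const int clamped = std::min(progress, progress_max);
    return static_cast<int>((clamped * 100LL) / progress_max);
}
```

For example, progressPercent(50, 200) gives 25, and an out-of-range pair such as (300, 200) is reported as 100 rather than 150.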
If you derive your own class from SimpleBrowser, your event handlers can choose whether or not to inform the parent window of the event, by whether or not they call the base class function.
Notification functions in the parent require an entry in the message map like this:
ON_NOTIFY(SimpleBrowser::NotificationType,control ID,OnNotificationType)
The functions themselves look like the following:
afx_msg void OnNotificationType(NMHDR *pNMHDR,LRESULT *pResult);
...
void MyDialog::OnNotificationType(NMHDR *pNMHDR,LRESULT *pResult)
{
    SimpleBrowser::Notification *notification = (SimpleBrowser::Notification *)pNMHDR;
    ...
    *pResult = 0;
}
The Notification structure is used to pass the information that describes the event:
| Element | Applies to NotificationTypes | Description |
| --- | --- | --- |
| NMHDR hdr | All | Standard notification header. hdr.hwndFrom = SimpleBrowser's window handle, hdr.idFrom = SimpleBrowser's control ID, and hdr.code = NotificationType for the notification. |
| CString URL | BeforeNavigate2, DocumentComplete, NavigateComplete2 | URL of the navigation/document. |
| CString frame | BeforeNavigate2 | The destination frame. |
| void *post_data | BeforeNavigate2 | If the navigation includes POST data, post_data will point to a buffer containing the data. Note that the data is only valid for the duration of the call. If the event handler needs to save the data, it should make a copy. If there is no POST data, post_data will be NULL. |
| int post_data_size | BeforeNavigate2 | Size of the POST data; 0 if none. |
| CString headers | BeforeNavigate2 | Headers to be sent to the server. |
| int progress | ProgressChange | Current progress value. |
| int progress_max | ProgressChange | Limit of the current progress value. |
| CString text | StatusTextChange, TitleChange | The text to be displayed in the status area or title. |
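Since post_data is only valid for the duration of the call, a handler that wants to keep the POST data has to copy it before returning. A minimal sketch of such a copy follows; copyPostData is a hypothetical helper, and it takes the same post_data and post_data_size values that the notification carries.

```cpp
#include <vector>

// Hypothetical helper: makes an owned copy of the POST data buffer.
// Returns an empty vector when there is no POST data
// (post_data == NULL or post_data_size == 0).
std::vector<unsigned char> copyPostData(const void *post_data,
                                        int post_data_size)
{
    if (post_data == nullptr || post_data_size <= 0) {
        return {};
    }
    const unsigned char *bytes =
        static_cast<const unsigned char *>(post_data);
    return std::vector<unsigned char>(bytes, bytes + post_data_size);
}
```

The returned vector owns its bytes, so it can be stored in the dialog or passed to another thread after the notification handler has returned.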
Please note that, even though I've used the standard WM_NOTIFY mechanism and the NMHDR notification header, SimpleBrowser doesn't support the common notifications like NM_CLICK, since these events are handled internally by the WebBrowser control.
About the code
SimpleBrowser is compatible with the WebBrowser control supplied by Internet Explorer version 5.0 and greater.
The demo program illustrates using SimpleBrowser in a dialog application. Enter text into the edit control at the top, and click on the Write button to invoke Write(...) with the contents of the edit control. The Resource (ANSI) and Resource (UNICODE) buttons use NavigateResource(...) to display HTML documents in the program's resources, encoded as ANSI and UNICODE respectively.
The demo program uses a class derived from SimpleBrowser (SimpleBrowser_Example) to show how events are handled. SimpleBrowser_Example constructs a string describing the event and passes the string on to the dialog, which then displays the string in the edit control at the bottom left of the dialog. The edit control at the bottom center of the dialog displays the same information as handled via the notification mechanism in the dialog itself.
I've included Visual C++ 6.0 workspace/project, Visual Studio .NET, Visual Studio 2005, and Visual Studio 2008 solution/project files. The project compiles for either MBCS (i.e., ANSI) or UNICODE through a #define found in stdafx.h.
Credits and references
I'd like to credit the following sources for information I used in developing the SimpleBrowser class:
History
- April 6, 2003
- April 11, 2003
  - Replaced the original NavigateString() method with Write(...) and Clear().
  - Added GetDocument() method.
  - Added notifications.
  - Modified Create() to wait for document available.
  - Expanded the BeforeNavigate2 event handling to include the POST data and headers.
  - Revised article text as appropriate.
- December 31, 2011
  - Updated links in article text.
  - The original code used a poor approach to handle the requirement that a document be completely loaded into the WebBrowser control prior to writing its own text. In an early version it used a 'private' message pump, and the final version used a busy-wait loop with a Sleep(0) call in the Create() function to wait for the document to be available. The new code removes the busy-wait loop. Instead, if the document is not ready, the code simply saves the text to be written. When the document becomes ready, any text that has been deferred is then written to the browser.
  - Bug fix: Corrected a memory leak in the SimpleBrowser::Notification class. The class did not provide a destructor, which meant that any POST data passed in the notification would leak.
  - Added a new ParsePostData() function to aid in handling POST data.
  - Bug fix: The OnBeforeNavigate2() handler was not including the headers. Thanks to Vic Mackey for catching the error and providing the fix.
  - Bug fix: Added keyboard translation (tab, delete, etc.) per a suggestion by cwswpl (Stephen). Stephen also noticed the dodgy approach to handling the initial 'document ready' problem and suggested a solution very similar to the one I used. I didn't use his suggestion for disabling the WebBrowser context menu, as the control includes an interface for customizing the context menu or eliminating it entirely.
  - Enhancements: Made the IWebBrowser2 *_Browser member protected rather than private based on a suggestion by Davide Calabro. Also modified the CreateFromControl() function to include a style argument, per another suggestion by Davide.
  - Enhancement: Per code provided by Toni Bauer, added handling for the "print template instantiation" and "print template teardown" events. The teardown event is especially useful, since it is triggered when printing has completed.
- February 12, 2011
  - Bug fix: Corrected double-delete of post data in notification structure, found by qmcock.