Introduction
Part of my current project requires the ability to edit custom HTML documents. I'm using MSHTML (the core component of Microsoft Internet Explorer) in edit mode via
CHtmlEditView
. Whilst it has its problems, MSHTML has a couple of overwhelming advantages. It's free and it can be assumed to be present on almost every Windows computer you'll ever encounter. It's also not terribly difficult to work with.
The documents I'm working with contain images and LABEL
controls, all absolutely positioned. The act of editing consists pretty much of replacing boilerplate text and images and moving them around on the screen. A nice-to-have is snap-to-grid, so that everything can be lined up easily, together with a visual representation of the grid.
Prior Art
A search on MSDN using 'snap to grid' located a match for MSHTML in code downloads (URL not included because such URLs tend to change) and a sample file called
EditHost.exe
which is a self-extracting executable. The sample includes the source code for an MSHTML host which implements exactly the functionality I wanted. The sample is, however, a non-MFC C++ application and the stuff of interest is wrapped up in a bunch of ATL classes.
Because my application is an MFC SDI application using the document/view architecture I decided to reimplement their solution using MFC. I chose not to try and use their ATL classes as presented because of some requirements imposed by the MFC document/view architecture. This didn't, however, stop me nicking some of their code :)
Basics
The functionality I wanted to implement falls into two pieces. The first is snap-to-grid. This basically requires the ability to intercept attempts to move or resize an element on the screen, receive the coordinates and modify those coordinates to enforce 'snap' granularity. The second piece is the ability to draw a grid on the MSHTML display surface.
Let's consider each of these pieces in turn.
Implementing snap-to-grid
MSHTML introduced the
IHTMLEditHost
COM interface in version 5.5 specifically to support snap-to-grid. You won't find an implementation of this interface in the standard libraries that ship with the Platform SDK - it's one you're expected to implement. In addition to the standard 3 methods in
IUnknown
(I won't mention the standard 3 methods anymore) it has one method,
SnapRect()
which MSHTML calls whenever you try to move or resize anything in the document you're currently editing. The parameters are an
IHTMLElement
interface for the selected element, the new rectangle (screen coordinates) for the element and a parameter specifying which drag handle is being used.
The element interface is useful because you can use it to query HTML attributes on the selected object. For example, you might want to be able to specify that a particular object is locked in place. You can set an attribute on the object specifying this. The SnapRect
method could query the attribute and, if it was set, force the element back to its original location and size. I use this functionality in my application but the code isn't presented here because I don't want to get bogged down in a bunch of support code.
The new rectangle is a pointer to a RECT
containing the new coordinates of the object. If you change the coordinates inside your SnapRect()
method the object moves to match your changes.
The drag handle is used to decide which, if any, coordinates in the RECT
should be altered to force 'snap-to-grid'.
I said above that you are expected to implement IHTMLEditHost
which raises the question of how MSHTML gets the interface you've implemented. It gets the interface in a two-step process. The first step is to request an IServiceProvider
interface on the host (your application). If it gets that interface it then requests an IHTMLEditHost
interface by calling IServiceProvider::QueryService()
.
Telling MSHTML to use your IHTMLEditHost implementation
Glossing over a lot of detail, the
CHtmlEditView
and
CHtmlView
classes implement a custom OLE control site. Both classes host MSHTML and provide a 'parent' interface that MSHTML can use to query for interfaces of interest. The stock class that handles the custom OLE control site is called
CHtmlControlSite
. The class only implements the
IDocHostUIHandler
interface which MSHTML uses to determine whether it should display scrollbars and suchlike.
(An earlier article of mine has some discussion of CHtmlControlSite.)
We can't derive a new class from CHtmlControlSite
for two reasons. The first is that the class definition isn't in a header file, it's in viewhtml.cpp
. More importantly, if we try to derive a class from CHtmlControlSite
the MFC COM Interface Macros will bite us in the bum. Our only recourse is to reimplement the class, which we do as class CHTMLEditControlSite
.
In the following code I'm not presenting the entirety of the class definition. We're going to take it piece by piece. The entire class definition is, naturally, included in the download. Our replacement CHTMLEditControlSite
class starts out looking like this.
class CHTMLEditControlSite : public COleControlSite
{
public:
    CHTMLEditControlSite(COleControlContainer* pParentWnd);

    CHtmlView *GetView() const;

protected:
    DECLARE_INTERFACE_MAP()

    BEGIN_INTERFACE_PART(DocHostUIHandler, IDocHostUIHandler)
        STDMETHOD(ShowContextMenu)(DWORD, LPPOINT, LPUNKNOWN, LPDISPATCH);
        STDMETHOD(GetHostInfo)(DOCHOSTUIINFO*);
        STDMETHOD(ShowUI)(DWORD, LPOLEINPLACEACTIVEOBJECT, LPOLECOMMANDTARGET,
                          LPOLEINPLACEFRAME, LPOLEINPLACEUIWINDOW);
        STDMETHOD(HideUI)(void);
        STDMETHOD(UpdateUI)(void);
        STDMETHOD(EnableModeless)(BOOL);
        STDMETHOD(OnDocWindowActivate)(BOOL);
        STDMETHOD(OnFrameWindowActivate)(BOOL);
        STDMETHOD(ResizeBorder)(LPCRECT, LPOLEINPLACEUIWINDOW, BOOL);
        STDMETHOD(TranslateAccelerator)(LPMSG, const GUID*, DWORD);
        STDMETHOD(GetOptionKeyPath)(OLECHAR **, DWORD);
        STDMETHOD(GetDropTarget)(LPDROPTARGET, LPDROPTARGET*);
        STDMETHOD(GetExternal)(LPDISPATCH*);
        STDMETHOD(TranslateUrl)(DWORD, OLECHAR*, OLECHAR **);
        STDMETHOD(FilterDataObject)(LPDATAOBJECT , LPDATAOBJECT*);
    END_INTERFACE_PART(DocHostUIHandler)

    BEGIN_INTERFACE_PART(ServiceProvider, IServiceProvider)
        STDMETHOD(QueryService)(REFGUID, REFIID, void **);
    END_INTERFACE_PART(ServiceProvider)

    BEGIN_INTERFACE_PART(HTMLEditHost, IHTMLEditHost)
        STDMETHOD(SnapRect)(IHTMLElement *pIElement, RECT *prcNew,
                            ELEMENT_CORNER eHandle);
        XHTMLEditHost();
        int m_iSnap;
    END_INTERFACE_PART(HTMLEditHost)
};
The
DocHostUIHandler
stuff is a literal copy from the MFC implementation of
CHtmlControlSite
. It pretty much delegates everything to virtual functions on the view class which is derived directly or indirectly from
CHtmlView
.
The ServiceProvider
is our implementation of the IServiceProvider
interface. Recall that MSHTML calls this interface asking for an IHTMLEditHost
interface. It won't mind in the least if an attempt to get an IServiceProvider
interface or an IHTMLEditHost
interface fails but if the attempt to get an IHTMLEditHost
interface succeeds MSHTML will call IHTMLEditHost::SnapRect()
as appropriate.
Our implementation of ServiceProvider::QueryService()
looks like this.
STDMETHODIMP CHTMLEditControlSite::XServiceProvider::QueryService(REFGUID guidService,
                                                                  REFIID riid,
                                                                  void **ppObj)
{
    METHOD_PROLOGUE_EX_(CHTMLEditControlSite, ServiceProvider)

    HRESULT hr = E_NOINTERFACE;

    *ppObj = NULL;

    // Only answer requests for the edit host service and interface.
    if (guidService == SID_SHTMLEditHost && riid == IID_IHTMLEditHost)
    {
        *ppObj = (void *) &pThis->m_xHTMLEditHost;
        hr = S_OK;
    }

    return hr;
}
This checks if the service being requested is the MSHTML
IHTMLEditHost
interface. If so it returns a pointer to our
IHTMLEditHost
implementation. Our
IHTMLEditHost
declaration was shown above. The constructor
XHTMLEditHost()
initialises snapping to 8 pixel boundaries. The real guts of the interface is in our implementation of
SnapRect()
which looks like this.
STDMETHODIMP CHTMLEditControlSite::XHTMLEditHost::SnapRect(
    IHTMLElement * ,
    RECT * prcNew,
    ELEMENT_CORNER eHandle)
{
    // The high-order bit of the return value is set while the key is down.
    if (GetAsyncKeyState(VK_CONTROL) & 0x8000)
        return S_OK;

    LONG lWidth = prcNew->right - prcNew->left;
    LONG lHeight = prcNew->bottom - prcNew->top;

    switch (eHandle)
    {
    case ELEMENT_CORNER_NONE:
        prcNew->top = ((prcNew->top + (m_iSnap / 2)) / m_iSnap) * m_iSnap;
        prcNew->left = ((prcNew->left + (m_iSnap / 2)) / m_iSnap) * m_iSnap;
        prcNew->bottom = prcNew->top + lHeight;
        prcNew->right = prcNew->left + lWidth;
        break;
    .
    .
    .
}
This does the appropriate arithmetic to force the
prcNew
rectangle onto snap boundaries depending on which resize handle was selected. The
GetAsyncKeyState(VK_CONTROL) & 0x8000
tests to see if the control key (either one of them) is down. If so it exits immediately, allowing the user to override our snap-to-grid functionality by holding down the control key as they drag the object around on the drawing surface.
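The rounding arithmetic from the move case can be tried in isolation. The following is a minimal sketch, not the code from the download: Rect is a stand-in for the Win32 RECT so that it compiles without the Windows headers, and SnapCoord and SnapMove are names made up for this illustration.

```cpp
#include <cassert>

// Stand-in for the Win32 RECT so the sketch compiles anywhere (assumption).
struct Rect { long left, top, right, bottom; };

// Round a coordinate to the nearest multiple of `grid`, using the same
// add-half-then-truncate arithmetic as SnapRect(). Integer division
// truncates toward zero, so this is only "nearest" for non-negative
// coordinates.
long SnapCoord(long value, long grid)
{
    assert(grid > 0);
    return ((value + grid / 2) / grid) * grid;
}

// Move case (ELEMENT_CORNER_NONE): snap the top-left corner and carry the
// original width and height along unchanged.
void SnapMove(Rect& rc, long grid)
{
    const long width  = rc.right - rc.left;
    const long height = rc.bottom - rc.top;
    rc.left   = SnapCoord(rc.left, grid);
    rc.top    = SnapCoord(rc.top, grid);
    rc.right  = rc.left + width;
    rc.bottom = rc.top + height;
}
```

With an 8 pixel grid a left edge at 11 snaps back to 8 while a left edge at 12 snaps forward to 16. The size of the rectangle never changes during a move because only the top-left corner is rounded.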
Drawing the grid
This is a little more difficult. MSHTML doesn't arbitrarily ask us for an interface this time around; instead we have to register an element behavio(u)r with MSHTML at the appropriate time and then request or supply the necessary interfaces. The appropriate time is, of course, when our document has been loaded.
Of course, whilst you can grab a device context handle to any MSHTML window and paint on it, what you really want is to paint the grid before MSHTML renders the rest of its display. Does the end user really want your gridlines drawn on top of their content? Achieving the correct painting order requires a bit of dancing with MSHTML.
Element behaviors
were added to Microsoft Internet Explorer version 5. They provide a 'hook' which can be used to modify the way a particular element behaves within an HTML page. The behavior can be many things but the one we're interested in is how the element is rendered. You can specify an element behavior on any element within the page as long as that element can return an
IHTMLElement2
interface. We register an element behavior by creating an object that implements the
IElementBehaviorFactory
interface and passing its address to the
IHTMLElement2::addBehavior
function. MSHTML then, as it renders the document, calls the behavior factory passing a bunch of parameters specifying exactly
which behavior it wants for this particular element, as specified by that element, and only for elements that have behaviors attached in HTML or those which have had
addBehavior
called on them. The element behavior factory then returns an
IElementBehavior
interface to an object that implements the behavior.
Given that we want to draw a grid on the background of the entire document an obvious starting place is with the document interface itself, IHTMLDocument2
. Unfortunately this doesn't work because the document itself isn't an element, it's an element container. We need to go down one level and get an interface to the body of the document. Even though it's the body it's still an IHTMLElement2
interface, meaning we could go even deeper and draw a grid on a single element on the page if we wanted to.
Once we've got a pointer to the body element of the document we add our behavior factory to it. Sometime later MSHTML calls our behavior factory requesting an IElementBehavior
interface. We dutifully return one. MSHTML then calls the IElementBehavior::Init()
function on our element behavior object, passing a pointer to an IElementBehaviorSite
interface. Our application then calls QueryInterface()
on the IElementBehaviorSite
requesting an IHTMLPaintSite
interface. Once we get the IHTMLPaintSite
interface we invalidate the rectangle it represents which, since we requested it on the body of our HTML document, means we're invalidating the entire MSHTML display surface. MSHTML obliges by repainting the display surface and, in the process, requests an
IHTMLPainter interface and calls its Draw()
method, which is where we draw the grid. Phew!
Maybe a diagram will help
Arrow endpoints indicate the destination of the interface.
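The order of calls can also be written down as a toy model. This is only a sketch of the sequence described above: none of the Toy classes are the real MSHTML interfaces, they are hypothetical stand-ins that do nothing except record who calls whom. FindBehavior is the name of the method on the real IElementBehaviorFactory interface, but its real parameter list is omitted here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which the toy objects are called.
struct CallLog { std::vector<std::string> calls; };

class ToyPaintSite            // plays the role of MSHTML's IHTMLPaintSite
{
public:
    explicit ToyPaintSite(CallLog& log) : m_log(log), m_dirty(false) {}
    void InvalidateRect() { m_log.calls.push_back("paintSite.InvalidateRect"); m_dirty = true; }
    bool IsDirty() const { return m_dirty; }
private:
    CallLog& m_log;
    bool m_dirty;
};

class ToyPainter              // plays the role of our IHTMLPainter
{
public:
    explicit ToyPainter(CallLog& log) : m_log(log) {}
    void Draw() { m_log.calls.push_back("painter.Draw"); }
private:
    CallLog& m_log;
};

class ToyBehavior             // plays the role of our IElementBehavior
{
public:
    explicit ToyBehavior(CallLog& log) : m_log(log), m_site(nullptr) {}
    void Init(ToyPaintSite& site)
    {
        m_log.calls.push_back("behavior.Init");
        m_site = &site;             // stands in for the QueryInterface() call
        m_site->InvalidateRect();   // ask for a repaint of the whole body
    }
private:
    CallLog& m_log;
    ToyPaintSite* m_site;
};

class ToyFactory              // plays the role of our IElementBehaviorFactory
{
public:
    explicit ToyFactory(CallLog& log) : m_log(log), m_behavior(log) {}
    ToyBehavior& FindBehavior()
    {
        m_log.calls.push_back("factory.FindBehavior");
        return m_behavior;
    }
private:
    CallLog& m_log;
    ToyBehavior m_behavior;
};

// Plays the role of MSHTML once addBehavior() has been called on the body.
void RunToyHost(CallLog& log)
{
    ToyFactory factory(log);
    ToyPaintSite site(log);
    ToyPainter painter(log);

    ToyBehavior& behavior = factory.FindBehavior();
    behavior.Init(site);
    if (site.IsDirty())
        painter.Draw();
}
```

Running RunToyHost() leaves four entries in the log, in this order: the factory is asked for a behavior, the behavior is initialised, the paint site is invalidated and finally the painter is asked to draw.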
The Grid code
I won't clutter this article with repeated blocks of BEGIN_INTERFACE_PART
/END_INTERFACE_PART
macros. I'll assume you understand how the MFC COM Interface Macros work and continue with the code of interest. Let's look first at the code that initiates the entire process, the code that installs the grid handler. This is part of the outer class, CHTMLEditControlSite
and it's called by our application.
void CHTMLEditControlSite::InstallGrid(IHTMLDocument2 *pDoc)
{
    HRESULT hr;
    IHTMLElement *pBody = NULL;
    IHTMLElement2 *pBody2 = NULL;
    VARIANT vFactory;

    if (pDoc == (IHTMLDocument2 *) NULL)
        return;

    // The behavior has to be attached to an element, so start from the body.
    hr = pDoc->get_body(&pBody);

    if (pBody == (IHTMLElement *) NULL)
        return;

    hr = pBody->QueryInterface(IID_IHTMLElement2, (void **) &pBody2);

    if (pBody2 == (IHTMLElement2 *) NULL)
    {
        pBody->Release();
        return;
    }

    // Remove any behavior left over from a previous call.
    if (m_gridCookie)
    {
        VARIANT_BOOL dummy;

        hr = pBody2->removeBehavior(m_gridCookie, &dummy);
        m_gridCookie = NULL;
    }

    V_VT(&vFactory) = VT_UNKNOWN;
    V_UNKNOWN(&vFactory) = &m_xHTMLElementBehaviorFactory;

    hr = pBody2->addBehavior(NULL, &vFactory, &m_gridCookie);

    // Release() returns the new reference count, not an HRESULT.
    pBody->Release();
    pBody2->Release();
    return;
}
This starts out by obtaining an
IHTMLElement
interface to the body of the document. Once we've got that we get an
IHTMLElement2
interface. When we've got our
IHTMLElement2
interface we call
addBehavior
on it passing a pointer to our element behavior factory.
addBehavior
returns us a cookie which we'll need later to remove the behavior.
Not much of interest happens until MSHTML has requested a bunch of other interfaces from us. Our behavior factory is called and we return a pointer to our IElementBehavior
interface. MSHTML then calls our IElementBehavior::Init()
method which looks like this.
STDMETHODIMP CHTMLEditControlSite::XHTMLElementBehavior::Init(
    IElementBehaviorSite *pBehaviorSite)
{
    HRESULT hr = pBehaviorSite->QueryInterface(IID_IHTMLPaintSite,
                                               (void **) &m_spPaintSite);

    if (m_spPaintSite != (IHTMLPaintSite *) NULL)
        m_spPaintSite->InvalidateRect(NULL);

    return hr;
}
The method receives an
IElementBehaviorSite
interface pointer. Not by coincidence this represents the body object in the document (it's the body because we used the body interface when we registered the behavior factory). We get a pointer to an
IHTMLPaintSite
interface through the behavior site interface. Once we've got that we can invalidate the display surface and force repaints whenever we want.
Meantime MSHTML queries us for an IHTMLPainter
interface. One thing it needs to know is our Z-order. Should we be called first so MSHTML can paint stuff over what we've painted, or do we get last crack at the display surface? So MSHTML calls our IHTMLPainter::GetPainterInfo()
method which looks like this.
STDMETHODIMP CHTMLEditControlSite::XHTMLPainter::GetPainterInfo(
    HTML_PAINTER_INFO *pInfo)
{
    if (pInfo == NULL)
        return E_POINTER;

    pInfo->lFlags = HTMLPAINTER_TRANSPARENT;
    pInfo->lZOrder = HTMLPAINT_ZORDER_BELOW_CONTENT;

    memset(&pInfo->iidDrawObject, 0, sizeof(IID));

    pInfo->rcExpand.left = 0;
    pInfo->rcExpand.right = 0;
    pInfo->rcExpand.top = 0;
    pInfo->rcExpand.bottom = 0;

    return S_OK;
}
This tells MSHTML to paint us below the document content, in other words to call us first.
Now it's drawing time. This code is trivial. MSHTML calls our IHTMLPainter::Draw()
method giving us the device context we should paint onto. All we do is draw our grid.
STDMETHODIMP CHTMLEditControlSite::XHTMLPainter::Draw(
    RECT rcBounds,
    RECT ,
    LONG ,
    HDC hdc,
    LPVOID )
{
    if (m_bGrid != FALSE)
    {
        HPEN redPen = CreatePen(PS_DOT, 0, RGB(0xff, 0x99, 0x99));
        HPEN oldPen = (HPEN) SelectObject(hdc, redPen);

        // Vertical lines, starting one grid step in from the left edge.
        for (LONG x = rcBounds.left + m_iGrid; x <= rcBounds.right; x += m_iGrid)
        {
            MoveToEx(hdc, x, rcBounds.top, NULL);
            LineTo(hdc, x, rcBounds.bottom);
        }

        // Horizontal lines. A separate loop variable keeps this compiling
        // whichever for-loop scoping rules the compiler applies.
        for (LONG y = rcBounds.top + m_iGrid; y <= rcBounds.bottom; y += m_iGrid)
        {
            MoveToEx(hdc, rcBounds.left, y, NULL);
            LineTo(hdc, rcBounds.right, y);
        }

        SelectObject(hdc, oldPen);
        DeleteObject(redPen);
    }

    return S_OK;
}
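The two loops in Draw() share one piece of arithmetic: where the lines fall along an axis. Here is that arithmetic as a small standalone sketch that can be checked without a device context. GridLinePositions is a name invented for this illustration and is not part of the download.

```cpp
#include <cassert>
#include <vector>

// Positions of the grid lines along one axis: start one grid step in from
// the near edge and keep going while still inside the far edge (inclusive),
// mirroring the loops in Draw().
std::vector<long> GridLinePositions(long nearEdge, long farEdge, long grid)
{
    assert(grid > 0);   // a zero or negative step would never terminate
    std::vector<long> positions;
    for (long pos = nearEdge + grid; pos <= farEdge; pos += grid)
        positions.push_back(pos);
    return positions;
}
```

For bounds running from 0 to 20 and an 8 pixel grid this yields lines at 8 and 16. Note that the positions are measured from the edge of the bounds rectangle that MSHTML passes in.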
Using the code
In your
CHtmlEditView
derived view class header you need to add this function prototype. It's not documented in MSDN but fortunately it's a
public virtual
function.
virtual BOOL CreateControlSite(COleControlContainer* pContainer,
                               COleControlSite** ppSite,
                               UINT nID, REFCLSID clsid);
Then add a member variable of type
CHTMLEditControlSite *
to the data declarations in the header. I call it
m_pEditSite
. Add this function to your view implementation file.
BOOL CMyHTMLEditView::CreateControlSite(
    COleControlContainer* pContainer,
    COleControlSite** ppSite, UINT ,
    REFCLSID )
{
    ASSERT(ppSite != NULL);
    *ppSite = m_pEditSite = new CHTMLEditControlSite(pContainer);
    return TRUE;
}
At an appropriate place in your view class (maybe
OnDownloadComplete()
) add a call to
m_pEditSite->InstallGrid()
.
void CMyHTMLEditView::OnDownloadComplete()
{
    CHtmlEditView::OnDownloadComplete();
    m_pDoc = (IHTMLDocument2 *) GetHtmlDocument();
    m_pEditSite->InstallGrid(m_pDoc);
    m_pEditSite->Grid(TRUE);
}
Once you've got the grid installed for this document you can toggle it off and on by calling CHTMLEditControlSite::Grid()
passing TRUE
or FALSE
. If you navigate to another document you must call CHTMLEditControlSite::InstallGrid()
again to reinstall the grid handler.
AddRef() and Release() notes
If you look through the source code you may notice that in some places I handle AddRef()
and Release()
correctly whilst in other places I don't. This isn't laziness or lack of knowledge on my part. The fact is that, unless I've seriously misunderstood COM reference counting, MSHTML doesn't seem to fully follow the reference counting rules. Sometimes it calls Release()
on the interface pointers we gave it, sometimes it doesn't. Our CHTMLEditControlSite
class is derived (via COleControlSite) from CCmdTarget
which implements reference counting on our behalf. As I discussed in my previous article, MFC COM Interface Macros, the CCmdTarget
destructor asserts (in debug builds) that the reference counter is less than or equal to 1. I found out by trial and error which interfaces ought to implement correct reference counting and which ones oughtn't. As it happens it doesn't matter that we're not correctly implementing COM reference counting given that our class is embedded as a member of the view class and won't go away until the view class goes away. One could argue that in this instance no interface within our class need implement correct reference counting but I prefer to do it the correct way whenever I can.
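To make the counting concrete, here is a toy model in plain C++. It is a sketch of the general idea only, not MFC's implementation: Outer and Part are hypothetical names, with Part standing in for a nested interface class that forwards its counting to the object that contains it.

```cpp
#include <cassert>

// Toy model of COM-style reference counting (not MFC's actual code).
class Outer
{
public:
    // Nested "interface part" that forwards counting to its owner.
    class Part
    {
    public:
        explicit Part(Outer& owner) : m_owner(owner) {}
        unsigned long AddRef()  { return m_owner.AddRef(); }
        unsigned long Release() { return m_owner.Release(); }
    private:
        Outer& m_owner;
    };

    // The object starts life holding one reference to itself.
    Outer() : m_refs(1), m_part(*this) {}

    unsigned long AddRef()  { return ++m_refs; }
    unsigned long Release() { assert(m_refs > 0); return --m_refs; }
    unsigned long RefCount() const { return m_refs; }
    Part& GetPart() { return m_part; }

private:
    unsigned long m_refs;   // declared first so it is initialised first
    Part m_part;
};
```

A client that pairs every AddRef() with a Release() leaves the count at 1. A client that takes a reference and never gives it back leaves the count at 2, which is the situation a debug-build check on destruction would complain about.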
Notes on the demo program
The demo uses a hardcoded reference to an HTML page which, in turn, has a hardcoded reference to the image file. You may need to tweak either the HTML page reference or the image reference within the HTML page.
History
25 April 2004 - Initial version.