Implementing snap-to-grid in an MSHTML based application

Rob Manderson

24 Apr 2004

How to implement snap-to-grid and draw the grid in MSHTML based applications.

Sample Image - snaptogrid.jpg

Introduction

Part of my current project requires the ability to edit custom HTML documents. I'm using MSHTML (the core component of Microsoft Internet Explorer) in edit mode via CHtmlEditView. Whilst it has its problems, MSHTML has a couple of overwhelming advantages. It's free and it can be assumed to be present on almost every Windows computer you'll ever encounter. It's also not terribly difficult to work with.

The documents I'm working with contain images and LABEL controls, all absolutely positioned. The act of editing consists pretty much of replacing boilerplate text and images and moving them around on the screen. A nice-to-have is snap-to-grid, so everything can be lined up easily, together with a visual representation of the grid.

Prior Art

A search on MSDN using 'snap to grid' located a match for MSHTML in code downloads (URL not included because they change) and a sample file called EditHost.exe, which is a self-extracting executable. The sample includes the source code for an MSHTML host which implements exactly the functionality I wanted. The sample is, however, a non-MFC C++ application and the stuff of interest is wrapped up in a bunch of ATL classes.

Because my application is an MFC SDI application using the document/view architecture I decided to reimplement their solution using MFC. I chose not to try and use their ATL classes as presented because of some requirements imposed by the MFC document/view architecture. This didn't, however, stop me nicking some of their code :)

Basics

The functionality I wanted to implement falls into two pieces. The first is snap-to-grid. This basically requires the ability to intercept attempts to move or resize an element on the screen, receive the coordinates and modify those coordinates to enforce 'snap' granularity. The second piece is the ability to draw a grid on the MSHTML display surface.

Let's consider each of these pieces in turn.

Implementing snap-to-grid

MSHTML introduced the IHTMLEditHost COM interface in version 5.5 specifically to support snap-to-grid. You won't find an implementation of this interface in the standard libraries that ship with the Platform SDK; it's one you're expected to implement. In addition to the three standard IUnknown methods (which I won't mention again) it has one method, SnapRect(), which MSHTML calls whenever you try to move or resize anything in the document you're currently editing. The parameters are an IHTMLElement interface for the selected element, the new rectangle (screen coordinates) for the element and a parameter specifying which drag handle is being used.

The element interface is useful because you can use it to query HTML attributes on the selected object. For example, you might want to be able to specify that a particular object is locked in place. You can set an attribute on the object specifying this. The SnapRect method could query the attribute and, if it was set, force the element back to its original location and size. I use this functionality in my application but the code isn't presented here because I don't want to get bogged down in a bunch of support code.

The new rectangle is a pointer to a RECT containing the new coordinates of the object. If you change the coordinates inside your SnapRect() method the object moves to match your changes.

The drag handle is used to decide which, if any, coordinates in the RECT should be altered to force 'snap-to-grid'.

I said above that you are expected to implement IHTMLEditHost, which raises the question of how MSHTML gets the interface you've implemented. It gets the interface in a two-step process. The first step is to request an IServiceProvider interface on the host (your application). If it gets that interface it then requests an IHTMLEditHost interface by calling IServiceProvider::QueryService().

Telling MSHTML to use your IHTMLEditHost implementation

Glossing over a lot of detail, the CHtmlEditView and CHtmlView classes implement a custom OLE control site. Both classes host MSHTML and provide a 'parent' interface that MSHTML can use to query for interfaces of interest. The stock class that handles the custom OLE control site is called CHtmlControlSite. The class only implements the IDocHostUIHandler interface, which MSHTML uses to determine whether it should display scrollbars and suchlike. (See my article here for some discussion of CHtmlControlSite.)

We can't derive a new class from CHtmlControlSite for two reasons. The first is that the class definition isn't in a header file; it's in viewhtml.cpp. More importantly, if we try to derive a class from CHtmlControlSite the MFC COM Interface Macros will bite us in the bum. Our only recourse is to reimplement the class, which we do as class CHTMLEditControlSite.

In the following code I'm not presenting the entirety of the class definition. We're going to take it piece by piece. The entire class definition is, naturally, included in the download. Our replacement CHTMLEditControlSite class starts out looking like this.

class CHTMLEditControlSite : public COleControlSite
{
public:
                    CHTMLEditControlSite(COleControlContainer* pParentWnd);

    CHtmlView       *GetView() const;

protected:
// Implementation

    DECLARE_INTERFACE_MAP()

    //  This is the implementation of the IDocHostUIHandler interface
    //  MSHTML gets this interface from us so we have to reference count it.

    BEGIN_INTERFACE_PART(DocHostUIHandler, IDocHostUIHandler)
        STDMETHOD(ShowContextMenu)(DWORD, LPPOINT, LPUNKNOWN, LPDISPATCH);
        STDMETHOD(GetHostInfo)(DOCHOSTUIINFO*);
        STDMETHOD(ShowUI)(DWORD, LPOLEINPLACEACTIVEOBJECT, LPOLECOMMANDTARGET, 
                          LPOLEINPLACEFRAME, LPOLEINPLACEUIWINDOW);
        STDMETHOD(HideUI)(void);
        STDMETHOD(UpdateUI)(void);
        STDMETHOD(EnableModeless)(BOOL);
        STDMETHOD(OnDocWindowActivate)(BOOL);
        STDMETHOD(OnFrameWindowActivate)(BOOL);
        STDMETHOD(ResizeBorder)(LPCRECT, LPOLEINPLACEUIWINDOW, BOOL);
        STDMETHOD(TranslateAccelerator)(LPMSG, const GUID*, DWORD);
        STDMETHOD(GetOptionKeyPath)(OLECHAR **, DWORD);
        STDMETHOD(GetDropTarget)(LPDROPTARGET, LPDROPTARGET*);
        STDMETHOD(GetExternal)(LPDISPATCH*);
        STDMETHOD(TranslateUrl)(DWORD, OLECHAR*, OLECHAR **);
        STDMETHOD(FilterDataObject)(LPDATAOBJECT , LPDATAOBJECT*);
    END_INTERFACE_PART(DocHostUIHandler)

    //  This is the implementation of the IServiceProvider interface
    //  MSHTML gets this interface from us so we have to reference count it.

    BEGIN_INTERFACE_PART(ServiceProvider, IServiceProvider)
        STDMETHOD(QueryService)(REFGUID, REFIID, void **);
    END_INTERFACE_PART(ServiceProvider)

    //  This is the implementation of the IHTMLEditHost interface
    //  MSHTML gets this interface from us so we have to reference count it.

    BEGIN_INTERFACE_PART(HTMLEditHost, IHTMLEditHost)
        STDMETHOD(SnapRect)(IHTMLElement *pIElement, RECT *prcNew,
                            ELEMENT_CORNER eHandle);

                    XHTMLEditHost();

        int         m_iSnap;
    END_INTERFACE_PART(HTMLEditHost)
};

The DocHostUIHandler stuff is a literal copy from the MFC implementation of CHtmlControlSite. It pretty much delegates everything to virtual functions on the view class which is derived directly or indirectly from CHtmlView.

The ServiceProvider is our implementation of the IServiceProvider interface. Recall that MSHTML calls this interface asking for an IHTMLEditHost interface. It won't mind in the least if an attempt to get an IServiceProvider interface or an IHTMLEditHost interface fails but if the attempt to get an IHTMLEditHost interface succeeds MSHTML will call IHTMLEditHost::SnapRect() as appropriate.

Our implementation of ServiceProvider::QueryService() looks like this.

STDMETHODIMP CHTMLEditControlSite::XServiceProvider::QueryService(REFGUID guidService,
                                                     REFIID riid, 
                                                     void **ppObj)
{
    METHOD_PROLOGUE_EX_(CHTMLEditControlSite, ServiceProvider)

    HRESULT hr = E_NOINTERFACE;

    *ppObj = NULL;

    if (guidService == SID_SHTMLEditHost && riid == IID_IHTMLEditHost)
    {
        *ppObj = (void *) &pThis->m_xHTMLEditHost;
        hr = S_OK;
    }

    return hr;
}

This checks whether the service being requested is the MSHTML IHTMLEditHost interface. If so, it returns a pointer to our IHTMLEditHost implementation. Our IHTMLEditHost declaration was shown above. The constructor XHTMLEditHost() initialises snapping to 8-pixel boundaries. The real guts of the interface are in our implementation of SnapRect(), which looks like this.

STDMETHODIMP CHTMLEditControlSite::XHTMLEditHost::SnapRect(
                  IHTMLElement * /*pIElement*/, 
                  RECT * prcNew, 
                  ELEMENT_CORNER eHandle)
{
    //  If the control key is down return (no snap). GetAsyncKeyState()
    //  returns a SHORT whose most significant bit (0x8000) is set while
    //  the key is down.
    if (GetAsyncKeyState(VK_CONTROL) & 0x8000)
        return S_OK;

    LONG lWidth = prcNew->right - prcNew->left;
    LONG lHeight = prcNew->bottom - prcNew->top;

    switch (eHandle)
    {
    case ELEMENT_CORNER_NONE:
        //  The whole element is being moved: snap the top left corner
        //  and preserve the original width and height.
        prcNew->top = ((prcNew->top + (m_iSnap / 2)) / m_iSnap) * m_iSnap;
        prcNew->left = ((prcNew->left + (m_iSnap / 2)) / m_iSnap) * m_iSnap;
        prcNew->bottom = prcNew->top + lHeight;
        prcNew->right = prcNew->left + lWidth;
        break;
   
    //  Other cases

    .
    .
    .
}

This does the appropriate arithmetic to force the prcNew rectangle onto snap boundaries depending on which resize handle was selected. The GetAsyncKeyState(VK_CONTROL) & 0x8000 test checks whether the control key (either one of them) is down. If so, the method exits immediately, allowing the user to override our snap-to-grid functionality by holding down the control key as they drag the object around on the drawing surface.
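The listing above elides the cases for the resize handles. As a rough illustration of the idea, and not the code from the download, the following portable sketch snaps only the edges that a given handle moves. The Rect struct and the Handle enum are hypothetical stand-ins for the Win32 RECT and the ELEMENT_CORNER enum from mshtml.h, and SnapCoord and SnapToGrid are names invented for this example.

```cpp
#include <cassert>

// Stand-in for the Win32 RECT so the sketch builds without <windows.h>.
struct Rect { long left, top, right, bottom; };

// Hypothetical handle identifiers; the real ELEMENT_CORNER enum is declared
// in mshtml.h.
enum class Handle { None, Top, Left, Bottom, Right,
                    TopLeft, TopRight, BottomLeft, BottomRight };

// Round v to the nearest multiple of grid, using the same arithmetic as the
// ELEMENT_CORNER_NONE case above (intended for non-negative coordinates).
long SnapCoord(long v, long grid)
{
    assert(grid > 0);
    return ((v + grid / 2) / grid) * grid;
}

// Snap only the edges moved by the given handle and return the result.
Rect SnapToGrid(Rect rc, Handle handle, long grid)
{
    const long width  = rc.right - rc.left;
    const long height = rc.bottom - rc.top;

    switch (handle)
    {
    case Handle::None:      // whole element dragged: snap origin, keep size
        rc.left   = SnapCoord(rc.left, grid);
        rc.top    = SnapCoord(rc.top, grid);
        rc.right  = rc.left + width;
        rc.bottom = rc.top + height;
        break;
    case Handle::Top:    rc.top    = SnapCoord(rc.top, grid);    break;
    case Handle::Bottom: rc.bottom = SnapCoord(rc.bottom, grid); break;
    case Handle::Left:   rc.left   = SnapCoord(rc.left, grid);   break;
    case Handle::Right:  rc.right  = SnapCoord(rc.right, grid);  break;
    case Handle::TopLeft:
        rc.top  = SnapCoord(rc.top, grid);
        rc.left = SnapCoord(rc.left, grid);
        break;
    case Handle::TopRight:
        rc.top   = SnapCoord(rc.top, grid);
        rc.right = SnapCoord(rc.right, grid);
        break;
    case Handle::BottomLeft:
        rc.bottom = SnapCoord(rc.bottom, grid);
        rc.left   = SnapCoord(rc.left, grid);
        break;
    case Handle::BottomRight:
        rc.bottom = SnapCoord(rc.bottom, grid);
        rc.right  = SnapCoord(rc.right, grid);
        break;
    }
    return rc;
}
```

With an 8 pixel grid, moving a rectangle whose top left corner is at (13, 5) lands it at (16, 8) with its size unchanged, whereas dragging the bottom right handle only adjusts the right and bottom edges. A real SnapRect() would apply the same idea directly to prcNew.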

Drawing the grid

This is a little more difficult. MSHTML doesn't arbitrarily ask us for an interface this time around; instead we have to register an element behavio(u)r with MSHTML at the appropriate time and then request or supply the necessary interfaces. The appropriate time is, of course, when our document has been loaded.

Of course, whilst you can grab a device context handle to any MSHTML window and paint on it, what you really want is to paint the grid before MSHTML renders the rest of its display. Does the end user really want your gridlines drawn on top of their content? Achieving the correct painting order requires a bit of dancing with MSHTML.

Element behaviors

Element behaviors were added to Microsoft Internet Explorer version 5. They provide a 'hook' which can be used to modify the way a particular element behaves within an HTML page. The behavior can be many things but the one we're interested in is how the element is rendered. You can specify an element behavior on any element within the page as long as that element can return an IHTMLElement2 interface. We register an element behavior by creating an object that implements the IElementBehaviorFactory interface and passing its address to the IHTMLElement2::addBehavior function. As MSHTML renders the document it calls the behavior factory, passing a bunch of parameters specifying exactly which behavior it wants for this particular element. It does this only for elements that have behaviors attached in HTML or that have had addBehavior called on them. The element behavior factory then returns an IElementBehavior interface to an object that implements the behavior.

Given that we want to draw a grid on the background of the entire document an obvious starting place is with the document interface itself, IHTMLDocument2. Unfortunately this doesn't work because the document itself isn't an element, it's an element container. We need to go down one level and get an interface to the body of the document. Even though it's the body it's still an IHTMLElement2 interface, meaning we could go even deeper and draw a grid on a single element on the page if we wanted to.

Once we've got a pointer to the body element of the document we add our behavior factory to it. Sometime later MSHTML calls our behavior factory requesting an IElementBehavior interface. We dutifully return one. MSHTML then calls the IElementBehavior::Init() function on our element behavior object, passing a pointer to an IElementBehaviorSite interface. Our application then calls QueryInterface() on the IElementBehaviorSite requesting an IHTMLPaintSite interface. Once we get the IHTMLPaintSite interface we invalidate the rectangle it represents which, since we requested it on the body of our HTML document, means we're invalidating the entire MSHTML display surface. MSHTML obliges by repainting the display surface and, in the process, requests an IHTMLPainter interface and calls its Draw() method, which is where we draw the grid. Phew!

Maybe a diagram will help

(Diagram: the interfaces passed between the application and MSHTML)

Arrow endpoints indicate the destination of the interface.

The Grid code

I won't clutter this article with repeated blocks of BEGIN_INTERFACE_PART/END_INTERFACE_PART macros. I'll assume you understand how the MFC COM Interface Macros work and continue with the code of interest. Let's look first at the code that initiates the entire process, the code that installs the grid handler. This is part of the outer class, CHTMLEditControlSite and it's called by our application.

void CHTMLEditControlSite::InstallGrid(IHTMLDocument2 *pDoc)
{
    HRESULT hr;

    IHTMLElement  *pBody = NULL;
    IHTMLElement2 *pBody2 = NULL;
    VARIANT       vFactory;

    if (pDoc == (IHTMLDocument2 *) NULL)
        return;

    // Get IHTMLElement and IHTMLElement2 interfaces for the body

    hr = pDoc->get_body(&pBody);

    if (pBody == (IHTMLElement *) NULL)
        return;

    hr = pBody->QueryInterface(IID_IHTMLElement2, (void **) &pBody2);

    if (pBody2 == (IHTMLElement2 *) NULL)
    {
        pBody->Release();
        return;
    }

    if (m_gridCookie)
    {
        VARIANT_BOOL dummy;
        hr = pBody2->removeBehavior(m_gridCookie, &dummy);
        m_gridCookie = NULL;
    }

    // Convert the grid factory pointer to the proper VARIANT data type
    // for IHTMLElement2::addBehavior

    V_VT(&vFactory) = VT_UNKNOWN;
    V_UNKNOWN(&vFactory) = &m_xHTMLElementBehaviorFactory;

    // Add Grid behavior

    hr = pBody2->addBehavior(NULL, &vFactory, &m_gridCookie);

    // Release resources

    // Release() returns the new reference count, not an HRESULT
    pBody2->Release();
    pBody->Release();
    return;
}

This starts out by obtaining an IHTMLElement interface to the body of the document. Once we've got that we get an IHTMLElement2 interface. When we've got our IHTMLElement2 interface we call addBehavior on it, passing a pointer to our element behavior factory. addBehavior returns a cookie which we'll need later to remove the behavior.

Not much of interest happens until MSHTML has requested a bunch of other interfaces from us. Our behavior factory is called and we return a pointer to our IElementBehavior interface. MSHTML then calls our IElementBehavior::Init() method which looks like this.

STDMETHODIMP CHTMLEditControlSite::XHTMLElementBehavior::Init(
                                       IElementBehaviorSite *pBehaviorSite)
{
    HRESULT hr = pBehaviorSite->QueryInterface(IID_IHTMLPaintSite, 
                                        (void **) &m_spPaintSite);

    if (m_spPaintSite != (IHTMLPaintSite *) NULL)
        m_spPaintSite->InvalidateRect(NULL);

    return hr;
}

The method receives an IElementBehaviorSite interface pointer. Not by coincidence this represents the body object in the document (it's the body because we used the body interface when we registered the behavior factory). We get a pointer to an IHTMLPaintSite interface through the behavior site interface. Once we've got that we can invalidate the display surface and force repaints whenever we want.

Meanwhile MSHTML queries us for an IHTMLPainter interface. One thing it needs to know is our Z-order. Should we be called first so MSHTML can paint stuff over what we've painted, or do we get last crack at the display surface? So MSHTML calls our IHTMLPainter::GetPainterInfo() method, which looks like this.

STDMETHODIMP CHTMLEditControlSite::XHTMLPainter::GetPainterInfo(
                                       HTML_PAINTER_INFO *pInfo)
{
    if (pInfo == NULL)
        return E_POINTER;

    pInfo->lFlags = HTMLPAINTER_TRANSPARENT;
    pInfo->lZOrder = HTMLPAINT_ZORDER_BELOW_CONTENT;

    memset(&pInfo->iidDrawObject, 0, sizeof(IID));

    pInfo->rcExpand.left = 0;
    pInfo->rcExpand.right = 0;
    pInfo->rcExpand.top = 0;
    pInfo->rcExpand.bottom = 0;

    return S_OK;
}

This tells MSHTML to call us first, so that the document content is painted on top of our grid.

Now it's drawing time. This code is trivial. MSHTML calls our IHTMLPainter::Draw() method giving us the device context we should paint onto. All we do is draw our grid.

STDMETHODIMP CHTMLEditControlSite::XHTMLPainter::Draw(
                                       RECT rcBounds, 
                                       RECT /*rcUpdate*/, 
                                       LONG /*lDrawFlags*/, 
                                       HDC hdc, 
                                       LPVOID /*pvDrawObject*/)
{
    if (m_bGrid != FALSE)
    {
        HPEN redPen = (HPEN) CreatePen(PS_DOT, 0, RGB(0xff, 0x99, 0x99));
        HPEN oldPen = (HPEN) SelectObject(hdc, redPen);

        //  Declared once, outside the loops, so the code compiles both on
        //  Visual C++ 6 and on compilers with standard for-loop scoping.
        long i;

        //  Vertical lines
        long lFirstLine = rcBounds.left + m_iGrid;

        for (i = lFirstLine; i <= rcBounds.right; i += m_iGrid)
        {
            MoveToEx(hdc, i, rcBounds.top, NULL);
            LineTo(hdc, i, rcBounds.bottom);
        }

        //  Horizontal lines
        lFirstLine = rcBounds.top + m_iGrid;

        for (i = lFirstLine; i <= rcBounds.bottom; i += m_iGrid)
        {
            MoveToEx(hdc, rcBounds.left, i, NULL);
            LineTo(hdc, rcBounds.right, i);
        }

        SelectObject(hdc, oldPen);
        DeleteObject(redPen);
    }

    return S_OK;
}
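The Draw() method above ignores rcUpdate and redraws every line within rcBounds. If that ever became a cost worth trimming, one option would be to draw only the lines that fall inside the update rectangle. The following portable sketch computes those line positions for one axis. GridLines is a hypothetical helper written for this illustration, not something from the download, and it assumes the same convention as Draw(): the first line sits one grid step in from the bounds origin.

```cpp
#include <cassert>
#include <vector>

// Return the grid line positions that fall within [lo, hi], where lines sit
// at origin + k * grid for k = 1, 2, 3... (the spacing used by Draw()).
// origin plays the role of rcBounds.left (or top); lo and hi would come from
// the update rectangle.
std::vector<long> GridLines(long origin, long grid, long lo, long hi)
{
    assert(grid > 0);

    std::vector<long> lines;

    // The first line Draw() would paint.
    long first = origin + grid;

    // Skip ahead to the first line at or after lo, staying on the grid.
    if (lo > first)
    {
        const long steps = (lo - origin + grid - 1) / grid;   // round up
        first = origin + steps * grid;
    }

    for (long pos = first; pos <= hi; pos += grid)
        lines.push_back(pos);

    return lines;
}
```

Called once with the horizontal extents and once with the vertical extents, this yields the coordinates to pass to MoveToEx() and LineTo(). Whether it's worth the extra code depends on how large the document is and how often MSHTML repaints it.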

Using the code

In the header of your CHtmlEditView-derived view class you need to add this function prototype. It's not documented in MSDN but fortunately it's a public virtual function.

virtual BOOL CreateControlSite(COleControlContainer* pContainer,
                                         COleControlSite** ppSite,
                                         UINT nID, REFCLSID clsid);

Then add a CHTMLEditControlSite pointer member variable to the data declarations in the header. I call it m_pEditSite. Add this function to your view implementation file.

BOOL CMyHTMLEditView::CreateControlSite(
                                         COleControlContainer* pContainer,
                                         COleControlSite** ppSite, UINT /* nID */,
                                         REFCLSID /* clsid */)
{
    ASSERT(ppSite != NULL);

    *ppSite = m_pEditSite = new CHTMLEditControlSite(pContainer);
    return TRUE;
}

At an appropriate place in your view class (maybe OnDownloadComplete()) add a call to m_pEditSite->InstallGrid().

void CMyHTMLEditView::OnDownloadComplete()
{
    CHtmlEditView::OnDownloadComplete();

    //  other code you need...

    m_pDoc = (IHTMLDocument2 *) GetHtmlDocument();
    m_pEditSite->InstallGrid(m_pDoc);
    m_pEditSite->Grid(TRUE);
}

Once you've got the grid installed for this document you can toggle it off and on by calling CHTMLEditControlSite::Grid() passing TRUE or FALSE. If you navigate to another document you must call CHTMLEditControlSite::InstallGrid() again to reinstall the grid handler.

AddRef() and Release() notes

If you look through the source code you may notice that in some places I handle AddRef() and Release() correctly whilst in other places I don't. This isn't laziness or lack of knowledge on my part. The fact is that, unless I've seriously misunderstood COM reference counting, MSHTML doesn't seem to fully follow the reference counting rules. Sometimes it calls Release() on the interface pointers we gave it, sometimes it doesn't. Our CHTMLEditControlSite class is derived from CCmdTarget, which implements reference counting on our behalf. As I discussed in my previous article, MFC COM Interface Macros, the CCmdTarget destructor asserts (in debug builds) that the reference counter is less than or equal to 1. I found out by trial and error which interfaces ought to implement correct reference counting and which ones oughtn't. As it happens it doesn't matter that we're not correctly implementing COM reference counting, given that our class is embedded as a member of the view class and won't go away until the view class goes away. One could argue that in this instance no interface within our class need implement correct reference counting, but I prefer to do it the correct way whenever I can.
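As a point of reference, the usual COM convention is that a method which hands out an interface pointer through an out parameter calls AddRef() on it first, and the caller calls Release() when it has finished with the pointer. The following portable sketch mimics that bookkeeping with a hypothetical Counted class and GetService function. Neither is a real COM or MFC type; they only make the expected counts visible.

```cpp
#include <cassert>

// Hypothetical stand-in for an object with COM-style reference counting.
class Counted
{
public:
    unsigned long AddRef()  { return ++m_refs; }
    unsigned long Release() { assert(m_refs > 0); return --m_refs; }
    unsigned long Refs() const { return m_refs; }

private:
    unsigned long m_refs = 1;   // the owner's own reference
};

// Mimics a QueryService()-style getter: the callee adds a reference on
// behalf of the caller before returning the pointer.
bool GetService(Counted &service, Counted **ppOut)
{
    if (ppOut == nullptr)
        return false;

    service.AddRef();
    *ppOut = &service;
    return true;
}
```

When both sides follow the convention the count returns to its starting value once the caller releases its pointer. If the caller never calls Release(), as MSHTML appears to do for some interfaces, the count stays one too high, which is the situation that trips the CCmdTarget assertion described above.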

Notes on the demo program

The demo uses a hardcoded reference to an HTML page which, in turn, has a hardcoded reference to the image file. You may need to tweak either the HTML page reference or the image reference within the HTML page.

History

25 April 2004 - Initial version.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
