C++ Wrapper classes for the COM interfaces of Microsoft XML parser (MSXML)

1 Jan 2002
This MFC extension DLL provides easy-to-use wrapper classes for all COM interfaces of Microsoft's DOM and SAX implementation.

Contents

  1. Introduction
  2. What is MSXMLCPP
  3. Concepts
  4. XML SAX wrapper nodes
  5. Drawbacks
  6. Requirements
  7. Unicode support
  8. How the wrapper classes have been generated
  9. The code examples

Introduction

Whether you want to do interprocess communication, store your application's settings or documents, or format text, XML provides a good solution. Furthermore, Microsoft provides a full implementation of the XML DOM (Document Object Model) and SAX (Simple API for XML), which allows reading and writing XML in an abstract manner: the Microsoft XML parser (MSXML).

Besides providing every feature you are likely to need when handling XML data, the Microsoft XML parser has the advantage of being available on every future Windows system. The drawback for us as C++ developers is that the parser exposes all these features through COM interfaces.

Using COM interfaces in C++ always adds a lot of overhead to the source code: you have to handle the HRESULT return values; you have to provide variables for the logical return values, whether you are interested in them or not, because they are passed back through the parameter list; you have to convert between zero-terminated strings and BSTRs; you have to take care of reference counting; and so on.

Take a look at the following two code examples. Both do the same thing: they read the value of an attribute of an XML element using Microsoft's implementation of the XML DOM:

C++
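CString GetAttributeValue(IXMLDOMDocument *pDoc, CString strElementPath, 
                                              CString strAttribute)
{
  IXMLDOMNode         *pNode = NULL;       // the element found by the path
  IXMLDOMNode         *pAttrNode = NULL;   // the attribute node of that element
  IXMLDOMNamedNodeMap *pAttributes = NULL;
  BSTR                bstrPath = strElementPath.AllocSysString();
  BSTR                bstrAttribute = strAttribute.AllocSysString();
  CString             strValue;            // stays empty if anything fails

  try
  {
    HRESULT hr;
    
    // S_FALSE means "not found", so anything but S_OK is treated as an error
    hr = pDoc->selectSingleNode(bstrPath, &pNode);
    if (hr != S_OK)
      throw hr;
  
    hr = pNode->get_attributes(&pAttributes);
    if (hr != S_OK)
      throw hr;
      
    // use a separate pointer so the element node is not overwritten and leaked
    hr = pAttributes->getNamedItem(bstrAttribute, &pAttrNode);
    if (hr != S_OK)
      throw hr;
      
    BSTR  bstrValue = NULL;
    hr = pAttrNode->get_text(&bstrValue);
    if (hr != S_OK)
      throw hr;
    
    // _bstr_t takes ownership of bstrValue (fCopy == false) and frees it
    strValue = (LPCTSTR)_bstr_t(bstrValue, false);
  }
  catch (HRESULT)
  {
    strValue.Empty();
  }

  // common cleanup for the success and the error path
  SysFreeString(bstrPath);
  SysFreeString(bstrAttribute);
  if (pAttrNode)
    pAttrNode->Release();
  if (pAttributes)
    pAttributes->Release();
  if (pNode)
    pNode->Release();

  return strValue;
}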
Visual Basic
Function GetAttributeValue(Doc As IXMLDOMDocument, _
        strElementPath As String, strAttribute As String) As String
    GetAttributeValue = Doc.selectSingleNode(strElementPath)._
                     attributes.getNamedItem(strAttribute).text
End Function

It's nearly unbelievable: both code examples do the same thing. OK, admittedly the C++ example does better error handling ;-). But the difference shown here makes clear why so many people write their own XML classes instead of using the Microsoft XML parser: its usage is much too unwieldy.

What is MSXMLCPP

That is where MSXMLCPP (MSXML C++) comes in. MSXMLCPP is an MFC extension DLL which provides wrapper classes for all the interfaces of the Microsoft DOM and SAX implementation.

The features are:

  • C++ wrapper classes for all the interfaces of the Microsoft DOM and SAX implementation.
  • C++ wrapper classes for easier implementation of interfaces you have to pass to the Microsoft implementation (e.g. ISAXContentHandler).
  • Smart reference counting for wrapped interfaces.
  • Wrapper methods return the real (logical) return values instead of HRESULT.
  • C++ exceptions for reporting unexpected HRESULTs.
  • Automatic translation of BSTR to CString and vice versa.
  • Automatic translation of VARIANT to the more comfortable _variant_t.

If you have ever dreamed of writing:

CString GetAttributeValue(CXMLDOMDocument &Doc, 
           CString strElementPath, CString strAttribute)
{
  return Doc.SelectSingleNode(strElementPath).GetAttributes().
                         GetNamedItem(strAttribute).GetText();
}

in C++ instead of the multi-line example given above, then welcome to your solution.

Concepts

This section describes the basic concepts of MSXMLCPP.

Wrapper Types

MSXMLCPP provides three kinds of wrapper classes:

  1. Calling wrappers
  2. Implementation wrappers
  3. Coclass wrappers

Calling wrappers

Calling wrappers are classes you can use to invoke methods of the interface wrapped by the specific class. All calling wrappers are derived from the template class CInterfaceCallingWrapper (defined in InterfaceWrapper.h).

Before using calling wrappers you have to attach an interface to an instance of the class. This can be done using a constructor, the assignment operator or explicitly using the Attach() method. Before assigning a new interface pointer to a wrapper, you always have to detach a previously attached one by calling the Detach() method.

The calling wrapper classes handle the reference counting for the attached interface pointers, so you do not have to care about it in normal usage. The classes provide several operator overloads that allow you to use the wrapper class wherever the original interface pointer is expected. For more information about the calling wrappers, take a look at the well-documented declaration of CInterfaceCallingWrapper in InterfaceWrapper.h.
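The attach/detach rule and the automatic reference counting can be illustrated with a small, portable sketch. It does not use the real CInterfaceCallingWrapper: IFakeUnknown, CountedObject and CallingWrapperSketch are hypothetical stand-ins, and the sketch assumes that attaching adds a reference, which may differ from the actual behavior documented in InterfaceWrapper.h.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM interface; real code would use IUnknown.
struct IFakeUnknown {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual ~IFakeUnknown() {}
};

// Minimal counted object, used only to observe the reference count.
class CountedObject : public IFakeUnknown {
public:
    CountedObject() : m_refs(1) {}                  // the creator holds one reference
    unsigned long AddRef() override { return ++m_refs; }
    unsigned long Release() override { return --m_refs; }  // sketch: no self-delete
    unsigned long Refs() const { return m_refs; }
private:
    unsigned long m_refs;
};

// Sketch of a calling wrapper: adds a reference on attach (an assumption),
// releases it on detach and on destruction.
template <class T>
class CallingWrapperSketch {
public:
    CallingWrapperSketch() : m_p(nullptr) {}
    explicit CallingWrapperSketch(T* p) : m_p(nullptr) { Attach(p); }
    CallingWrapperSketch(const CallingWrapperSketch& other) : m_p(nullptr) { Attach(other.m_p); }
    ~CallingWrapperSketch() { Detach(); }

    CallingWrapperSketch& operator=(const CallingWrapperSketch& other) {
        if (this != &other) { Detach(); Attach(other.m_p); }
        return *this;
    }

    void Attach(T* p) {
        assert(m_p == nullptr);   // mirrors the rule: detach before attaching anew
        m_p = p;
        if (m_p) m_p->AddRef();
    }

    void Detach() {
        if (m_p) { m_p->Release(); m_p = nullptr; }
    }

    operator T*() const { return m_p; }   // usable where a raw pointer is expected
    T* operator->() const { return m_p; }

private:
    T* m_p;
};

// Returns the reference count seen while two wrappers share one object.
unsigned long SharedCountDemo(CountedObject& obj) {
    CallingWrapperSketch<CountedObject> a(&obj);
    CallingWrapperSketch<CountedObject> b = a;   // the copy adds another reference
    return obj.Refs();                           // creator + two wrappers
}
```

When both wrappers go out of scope, their destructors release the references again, so the caller never writes AddRef() or Release() by hand.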

A calling wrapper class provides all the methods and properties of the corresponding COM interface, but there are some slight differences:

  • The method names have been adapted to the MFC conventions (beginning with a capital letter).
  • The methods for getting a property are prefixed with Get and the methods for setting a property are prefixed with Set or SetRef.
  • The method return values reflect the logical result of the operation (the parameter marked with the MIDL [retval] attribute).
  • BSTR parameters are replaced by LPCTSTR, BSTR* parameters are replaced by CString& and BSTR return values are replaced by CString return values.
  • VARIANT parameters are replaced by _variant_t, VARIANT* parameters are replaced by _variant_t& and VARIANT return values are replaced by _variant_t return values.
  • If an interface's method returns a pointer to an interface for which a wrapper class has been defined, the wrapping method returns an object of this wrapper class instead.

Because the calling wrapper classes do not return HRESULT values, they use the C++ exception mechanism to inform you about unexpected values returned by the original interface methods. Each time an HRESULT value is received for which the SUCCEEDED macro does not return TRUE, an exception of the type CComException* is thrown.
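The following portable sketch illustrates that conversion. It is not MSXMLCPP code: HResultSketch, ComExceptionSketch, RawGetText and the other names are hypothetical stand-ins for HRESULT, CComException and a raw interface method, chosen so that the example compiles without the Windows headers.

```cpp
#include <cassert>
#include <string>

typedef int HResultSketch;                        // stand-in for HRESULT
const HResultSketch kOk   = 0;                    // plays the role of S_OK
const HResultSketch kFail = -2147467259;          // same bit pattern as E_FAIL (0x80004005)

// Same idea as the SUCCEEDED macro: non-negative codes mean success.
inline bool Succeeded(HResultSketch hr) { return hr >= 0; }

// Hypothetical exception type: carries the failed status code, thrown by pointer.
struct ComExceptionSketch {
    explicit ComExceptionSketch(HResultSketch hr) : m_hr(hr) {}
    void Delete() { delete this; }
    HResultSketch m_hr;
};

// Raw, COM-style function: status as return value, result through an out-parameter.
HResultSketch RawGetText(bool succeed, std::string* pText) {
    assert(pText != nullptr);
    if (!succeed) return kFail;
    *pText = "hello";
    return kOk;
}

// Wrapper-style function: returns the logical result and throws on failure.
std::string GetText(bool succeed) {
    std::string text;
    HResultSketch hr = RawGetText(succeed, &text);
    if (!Succeeded(hr))
        throw new ComExceptionSketch(hr);
    return text;
}

// Caller side: catch the pointer, read the status code, then destroy the exception.
HResultSketch TryGetText(bool succeed, std::string& out) {
    try {
        out = GetText(succeed);
        return kOk;
    } catch (ComExceptionSketch* pE) {
        HResultSketch hr = pE->m_hr;
        pE->Delete();
        return hr;
    }
}
```

The wrapper-style GetText() can be chained with other calls because its return value is the result itself; the status code only surfaces when something goes wrong.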

Implementation wrappers

Implementation wrappers are classes that allow you to implement the corresponding interface simply by deriving your class from the implementation wrapper class and overriding the high-level pure virtual functions. The implementation wrappers are derived from CInterfaceImplementationWrapper or CDispatchInterfaceImplementationWrapper, depending on whether their related interfaces are derived from IUnknown or IDispatch.

Implementation wrapping is done by aggregation: the implementation wrapper class contains a member (m_x...) that is an implementation of the original interface. Each method of this implementation first calls the virtual method MethodPrologue() of the aggregating class and then the corresponding virtual wrapper method of the aggregating class. Those wrapper methods are pure virtual methods you have to implement in your derived class. They are identical to the interface's original methods, apart from some changes:

  • The method names have been adapted to the MFC conventions (beginning with a capital letter).
  • The methods for getting a property are prefixed with Get and the methods for setting a property are prefixed with Set or SetRef.
  • The method return values reflect the logical result of the operation (the parameter marked with the MIDL [retval] attribute).
  • BSTR parameters are replaced by LPCTSTR, BSTR* parameters are replaced by CString& and BSTR return values are replaced by CString return values.
  • VARIANT parameters are replaced by _variant_t, VARIANT* parameters are replaced by _variant_t& and VARIANT return values are replaced by _variant_t return values.

This makes the implementation much easier than implementing the original interfaces with all their BSTRs, VARIANTs and HRESULTs.

If you are implementing an interface based on the implementation wrappers as a stand-alone class, the standard reference counting mechanism implemented in CInterfaceImplementationWrapper should be sufficient. If you want to use your interface implementation as part of a class that implements more interfaces, or as part of a CCmdTarget-derived class, you will have to modify the reference counting by overriding the virtual AddRef() and Release() methods. Furthermore, you can override the QueryInterface() method and the MethodPrologue() method. Overriding MethodPrologue() will be necessary in most MFC applications to activate the right module state (e.g. calling AFX_MANAGE_STATE(AfxGetStaticModuleState()) would be a good idea).

Because your derived classes cannot signal errors through an HRESULT return value, you can throw a CComException instead by calling the global AfxThrowComException() function. Every method call in the implementation wrapper is enclosed in a try block with a catch (CComException *pE) handler. If you throw a CComException in your implementation, the implementation wrapper returns the given HRESULT value to the caller instead of the usual S_OK.
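The mechanism described in this section (an inner interface implementation that calls MethodPrologue(), forwards to a high-level virtual method and translates a thrown exception back into a status code) can be sketched in portable C++. Nothing here is the library's actual code: IRawHandler, HandlerImplSketch, ComExceptionSketch and ElementCollector are hypothetical names, and the raw interface uses a plain C string where a real COM interface would use a BSTR.

```cpp
#include <cassert>
#include <string>
#include <vector>

typedef int HResultSketch;                        // stand-in for HRESULT
const HResultSketch kOk   = 0;                    // plays the role of S_OK
const HResultSketch kFail = -2147467259;          // same bit pattern as E_FAIL

// Hypothetical exception type, thrown by pointer like CComException.
struct ComExceptionSketch {
    explicit ComExceptionSketch(HResultSketch hr) : m_hr(hr) {}
    void Delete() { delete this; }
    HResultSketch m_hr;
};

// Raw, COM-style callback interface (hypothetical): returns a status code.
struct IRawHandler {
    virtual HResultSketch startElement(const char* name) = 0;
    virtual ~IRawHandler() {}
};

// Sketch of an implementation wrapper: the nested member implements the raw
// interface and forwards every call to the high-level virtual methods.
class HandlerImplSketch {
public:
    HandlerImplSketch() : m_xHandler(this) {}
    virtual ~HandlerImplSketch() {}
    IRawHandler* GetInterface() { return &m_xHandler; }

protected:
    virtual void MethodPrologue() {}                          // hook, e.g. for module state
    virtual void StartElement(const std::string& name) = 0;   // high-level method

private:
    class XHandler : public IRawHandler {
    public:
        explicit XHandler(HandlerImplSketch* outer) : m_outer(outer) {}
        HResultSketch startElement(const char* name) override {
            assert(name != nullptr);
            try {
                m_outer->MethodPrologue();
                m_outer->StartElement(name);
                return kOk;
            } catch (ComExceptionSketch* pE) {
                HResultSketch hr = pE->m_hr;   // report the thrown code to the caller
                pE->Delete();
                return hr;
            }
        }
    private:
        HandlerImplSketch* m_outer;
    };

    XHandler m_xHandler;
};

// A derived class only deals with high-level types and exceptions.
class ElementCollector : public HandlerImplSketch {
public:
    std::vector<std::string> names;
    int prologueCalls = 0;

protected:
    void MethodPrologue() override { ++prologueCalls; }
    void StartElement(const std::string& name) override {
        if (name.empty())
            throw new ComExceptionSketch(kFail);   // signals an error without a return code
        names.push_back(name);
    }
};
```

The caller of the raw interface never sees a C++ exception: a throw inside StartElement() reaches it as an ordinary failure code.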

CoClass wrappers

The CoClass wrappers are small wrapper classes which allow you to create COM objects and retrieve the specified interfaces. This frees you from calling CoCreateInstance().

CComException

Because the HRESULT values are encapsulated by the wrapper classes, it was necessary to provide a mechanism that allows your software to handle unexpected HRESULTs. Therefore the class CComException has been declared in InterfaceWrapper.h. The class is derived from CException and contains a single attribute m_hr of the type HRESULT.

To throw a CComException you should use the static CComException::Throw() method or the global AfxThrowComException() function. Both create a CComException object on the heap, set its m_hr attribute to the specified value and throw the pointer.

If you have caught a pointer to a CComException object and you do not need it any more, you should call its Delete() method to destroy it.

XML SAX wrapper nodes

For the SAX implementation, Microsoft provides two kinds of interfaces: the C++ interfaces and the Visual Basic interfaces. Because I generated all the wrapper classes with a tool that analyzes the parameters of each method to generate the method wrapper, I needed to use the Visual Basic interfaces to get "nice" wrapper classes. The problem with the C++ interfaces (which may be a little faster) is that they use two parameters for each string: the string itself and a length argument. I did not want to modify the generation tool to recognize this, because the case is rare, so I used the somewhat slower Visual Basic interfaces. They use normal BSTRs, which the generation tool recognizes.
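To illustrate why the two-parameter form needs special treatment, here is a small sketch that turns a (pointer, length) pair into an owning string. The helper FromCountedString is hypothetical and not part of MSXMLCPP; the parameter names only follow the counted-string style described above. Such strings are generally not zero-terminated, so the length must always be honored.

```cpp
#include <cassert>
#include <string>

// Builds an owning string from a counted string. The buffer is not assumed to be
// zero-terminated, so exactly cchChars characters are copied.
std::wstring FromCountedString(const wchar_t* pwchChars, int cchChars) {
    assert(pwchChars != nullptr && cchChars >= 0);
    return std::wstring(pwchChars, static_cast<std::wstring::size_type>(cchChars));
}
```

A generator that maps each parameter on its own would treat the pointer and the length as two unrelated arguments, whereas a single BSTR parameter can be mapped directly to one string type.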

Drawbacks

Like everything good, MSXMLCPP has a drawback: the wrapper classes add some run-time overhead for the type conversions and the additional function calls. If your application has to be very fast, it may be better to work with the raw interfaces, but I think the overhead is small enough to be justified by the more readable code.

Requirements

To use the MSXMLCPP library with your application, simply include the msxmlcpp.h header file in your stdafx.h (or anywhere else) and link to msxmlcpp.lib (msxmlcppD.lib for the debug version). MSXMLCPP uses RTTI (Runtime Type Information), so it would be a good idea to enable it for your application too.

To run an application using the MSXMLCPP DLL, the Microsoft XML parser (MSXML) has to be installed on the system. You can download it from the Microsoft website.

This should run on Windows 95 or later and Windows NT 3.1 or later.

Unicode support

Though I haven't tested it, there is no reason why this should not work with Unicode.

How the wrapper classes have been generated

All the wrapper classes have been generated with a tool of mine that takes a type library and some rules as input and generates the wrapper classes from the information in the type library. I hope to be able to provide a first free beta of this tool on CodeProject in a few weeks. I think this will make the work much easier for all of us C++ programmers.

The code examples

Two code examples that use the library are included with the full download. Both are simple console applications, to concentrate on the interesting stuff. Both expect a file name as a command line parameter, which is used as the XML file. A valid file (books.xml) is contained in the Output directory of the ZIP archive.

The application DOMXMLDemo demonstrates how to read and write an XML file using the DOM wrapper classes.

The application SAXXMLDemo demonstrates how to read XML files using the SAX wrapper classes. This also shows how to implement interfaces using implementation wrapper classes.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.