Contents
- Introduction
- What is MSXMLCPP
- Concepts
- XML SAX wrapper nodes
- Drawbacks
- Requirements
- Unicode support
- How the wrapper classes have been generated
- The code examples
Introduction
Whether you want to do interprocess communication, store your application's settings or documents, or format text, XML provides a good solution for all of these cases. In addition, Microsoft provides a full implementation of the XML DOM (Document Object Model) and SAX (Simple API for XML), which allows reading and writing XML in an abstract manner: the Microsoft XML parser (MSXML).
Besides providing all the features you will ever need when handling XML data, the Microsoft XML parser also has the advantage of being available on every future Windows system. The drawback for us as C++ developers is that the Microsoft XML parser provides all these features through COM interfaces.
Using COM interfaces in C++ always produces a lot of overhead in the source code: you have to handle the HRESULT return values; you have to provide variables for the logical return values, whether you are interested in them or not, because they are passed through the parameter list; you have to convert between zero-terminated strings and BSTR; and you have to take care of reference counting, and so on.
Simply take a look at the following two code examples. Both do the same thing: they read the value of an attribute of an XML element using Microsoft's implementation of the XML DOM:
C++
CString GetAttributeValue(IXMLDOMDocument *pDoc, CString strElementPath,
                          CString strAttribute)
{
    IXMLDOMNode *pNode = NULL;
    IXMLDOMNode *pAttribute = NULL;
    IXMLDOMNamedNodeMap *pAttributes = NULL;
    BSTR bstrPath = strElementPath.AllocSysString();
    BSTR bstrAttribute = strAttribute.AllocSysString();
    CString strValue;
    try
    {
        HRESULT hr;
        // S_FALSE means "not found", so anything but S_OK counts as a failure
        hr = pDoc->selectSingleNode(bstrPath, &pNode);
        if (hr != S_OK)
            throw hr;
        hr = pNode->get_attributes(&pAttributes);
        if (hr != S_OK)
            throw hr;
        // Use a separate pointer so the element node is not leaked
        hr = pAttributes->getNamedItem(bstrAttribute, &pAttribute);
        if (hr != S_OK)
            throw hr;
        BSTR bstrValue = NULL;
        hr = pAttribute->get_text(&bstrValue);
        if (hr != S_OK)
            throw hr;
        // _bstr_t takes ownership of bstrValue and frees it
        strValue = (LPCTSTR)_bstr_t(bstrValue, false);
    }
    catch (HRESULT)
    {
        strValue.Empty();
    }
    // Shared cleanup for the success path and the failure path
    SysFreeString(bstrPath);
    SysFreeString(bstrAttribute);
    if (pAttribute)
        pAttribute->Release();
    if (pAttributes)
        pAttributes->Release();
    if (pNode)
        pNode->Release();
    return strValue;
}
Visual Basic
Function GetAttributeValue(Doc As IXMLDOMDocument, _
    strElementPath As String, strAttribute As String) As String
    GetAttributeValue = Doc.selectSingleNode(strElementPath).attributes.getNamedItem(strAttribute).text
End Function
It's nearly unbelievable: both code examples do the same thing. OK, admittedly the C++ example does better error handling ;-). But the differences shown here make clear why so many people write their own XML classes instead of using the Microsoft XML parser: its usage is much too unwieldy.
What is MSXMLCPP
That is where MSXMLCPP (MSXML C++) comes in. MSXMLCPP is an MFC extension DLL that provides wrapper classes for all the interfaces of the Microsoft DOM and SAX implementation.
The features are:
- C++ wrapper classes for all the interfaces of the Microsoft DOM and SAX implementation.
- C++ wrapper classes for easier implementation of interfaces you have to pass to the Microsoft implementation (e.g. ISAXContentHandler).
- Smart reference counting for wrapped interfaces.
- Wrapper methods return the real (logical) return values instead of HRESULT.
- C++ exceptions for reporting unexpected HRESULTs.
- Automatic translation of BSTR to CString and vice versa.
- Automatic translation of VARIANT to the more comfortable _variant_t.
If you have ever dreamed of writing:
CString GetAttributeValue(CXMLDOMDocument &Doc,
                          CString strElementPath, CString strAttribute)
{
    return Doc.SelectSingleNode(strElementPath).GetAttributes().
        GetNamedItem(strAttribute).GetText();
}
in C++ instead of the multi-line example given above, then welcome to your solution.
Concepts
This section describes the basic concepts of MSXMLCPP.
Wrapper Types
MSXMLCPP provides three kinds of wrapper classes:
- Calling wrappers
- Implementation wrappers
- Coclass wrappers
Calling wrappers
Calling wrappers are classes you can use to invoke methods of the interface wrapped by the specific class. All calling wrappers are derived from the template class CInterfaceCallingWrapper (defined in InterfaceWrapper.h).
Before using a calling wrapper, you have to attach an interface to an instance of the class. This can be done using a constructor, the assignment operator or explicitly using the Attach() method. Before assigning a new interface pointer to a wrapper, you always have to detach a previously attached one by calling the Detach() method.
The calling wrapper classes handle the reference counting for the attached interface pointers, so you won't have to care about it in normal usage. The classes provide several operator overloads that allow you to use the wrapper class wherever the original interface pointer is expected. For more information about the calling wrappers, take a look at the well-documented declaration of CInterfaceCallingWrapper in InterfaceWrapper.h.
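To make the attach/detach rules more concrete, here is a minimal, portable sketch of such a wrapper. It is not the real CInterfaceCallingWrapper: FakeUnknown and CallingWrapperSketch are hypothetical names, and the assumption that attaching adds a reference while detaching releases it is mine, not something taken from InterfaceWrapper.h.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM interface: only the reference-counting part.
struct FakeUnknown {
    long refs = 1;  // the creator of the object holds the first reference
    long AddRef() { return ++refs; }
    long Release() { return --refs; }  // a real COM object deletes itself at 0
};

// Minimal sketch of a calling wrapper (an assumption, not the library class).
template <class T>
class CallingWrapperSketch {
public:
    CallingWrapperSketch() = default;
    explicit CallingWrapperSketch(T* p) { Attach(p); }
    CallingWrapperSketch(const CallingWrapperSketch& other) { Attach(other.m_p); }
    CallingWrapperSketch& operator=(const CallingWrapperSketch& other) {
        // Mirrors the rule above: the target must be detached before assigning.
        if (this != &other) Attach(other.m_p);
        return *this;
    }
    ~CallingWrapperSketch() { Detach(); }

    void Attach(T* p) {
        assert(m_p == nullptr && "detach the previous pointer first");
        m_p = p;
        if (m_p) m_p->AddRef();  // the wrapper holds its own reference
    }
    void Detach() {
        if (m_p) {
            m_p->Release();  // give the wrapper's reference back
            m_p = nullptr;
        }
    }

    // Let the wrapper stand in for the raw interface pointer.
    operator T*() const { return m_p; }
    T* operator->() const { return m_p; }

private:
    T* m_p = nullptr;
};
```

With this sketch, copying a wrapper adds a reference and destroying it gives the reference back, so client code never calls AddRef() or Release() itself.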
A calling wrapper class provides all the methods and properties of the corresponding COM interface, but there are some slight differences:
- The method names have been adapted to the MFC conventions (beginning with a capital letter).
- The methods for getting a property are prefixed with Get and the methods for setting a property are prefixed with Set or SetRef.
- The method return values reflect the logical result of the operation (MIDL [retval] attribute).
- BSTR parameters are replaced by LPCTSTR, BSTR* parameters are replaced by CString& and BSTR return values are replaced by CString return values.
- VARIANT parameters are replaced by _variant_t, VARIANT* parameters are replaced by _variant_t& and VARIANT return values are replaced by _variant_t return values.
- If an interface's method returns a pointer to an interface for which a wrapper class has been defined, the wrapping method returns an object of this wrapper class instead.
Because the calling wrapper classes do not return any HRESULT values, they use the C++ exception mechanism to inform you about unexpected values returned by the original interface methods. Each time an HRESULT value is received for which the SUCCEEDED macro does not return TRUE, an exception of type CComException* is thrown.
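The following portable sketch illustrates that transformation: a raw method that returns a status code and hands its result back through an out-parameter becomes a wrapper method that returns the value and throws on failure. Everything in it is a stand-in of my own (the HRESULT typedef, Succeeded(), RawNode, NodeWrapperSketch and ComExceptionSketch); the generated MSXMLCPP classes will differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-ins so the sketch builds without the Windows headers (assumptions).
typedef std::int32_t HRESULT;
const HRESULT S_OK = 0;
const HRESULT E_FAIL = -2147467259;  // 0x80004005 as a signed 32-bit value
inline bool Succeeded(HRESULT hr) { return hr >= 0; }  // same test as SUCCEEDED

// Hypothetical exception type carrying the failing HRESULT.
struct ComExceptionSketch {
    HRESULT m_hr;
};

// Hypothetical raw interface in COM style: status code as the return value,
// logical result through an out-parameter.
struct RawNode {
    std::string text;
    bool broken;
    HRESULT get_text(std::string* out) const {
        if (broken || out == nullptr) return E_FAIL;
        *out = text;
        return S_OK;
    }
};

// Wrapper in the style described above: get_text becomes GetText, the
// out-parameter becomes the return value, a failure becomes an exception.
class NodeWrapperSketch {
public:
    explicit NodeWrapperSketch(const RawNode* p) : m_p(p) { assert(p != nullptr); }
    std::string GetText() const {
        std::string value;
        const HRESULT hr = m_p->get_text(&value);
        if (!Succeeded(hr))
            throw new ComExceptionSketch{hr};  // thrown as a pointer, MFC style
        return value;
    }

private:
    const RawNode* m_p;
};
```

A caller can then write node.GetText() inside an expression and handle all failures in one catch (ComExceptionSketch*) block.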
Implementation wrappers
Implementation wrappers are classes that allow you to implement the corresponding interface simply by deriving your class from the implementation wrapper class and overriding the high-level pure virtual functions. The implementation wrappers are derived from CInterfaceImplementationWrapper or CDispatchInterfaceImplementationWrapper, depending on whether their related interfaces are derived from IUnknown or IDispatch.
Implementation wrapping is simply done by aggregation: the implementation wrapper class contains a member (m_x...) that is an implementation of the original interface. Each method of this implementation calls the virtual method MethodPrologue() of the aggregating class and afterwards the corresponding virtual wrapper method of the aggregating class. Those wrapper methods are pure virtual methods you have to implement in your derived class. They are identical to the interface's original methods, apart from some changes:
- The method names have been adapted to the MFC conventions (beginning with a capital letter).
- The methods for getting a property are prefixed with Get and the methods for setting a property are prefixed with Set or SetRef.
- The method return values reflect the logical result of the operation (MIDL [retval] attribute).
- BSTR parameters are replaced by LPCTSTR, BSTR* parameters are replaced by CString& and BSTR return values are replaced by CString return values.
- VARIANT parameters are replaced by _variant_t, VARIANT* parameters are replaced by _variant_t& and VARIANT return values are replaced by _variant_t return values.
This makes implementation much easier than implementing the original interfaces with all their BSTRs, VARIANTs and HRESULTs.
If you are implementing an interface based on the implementation wrappers as a stand-alone class, the standard reference-counting mechanism implemented in CInterfaceImplementationWrapper should be OK. But if you want to use your interface implementation as part of a class that implements more interfaces, or as part of a CCmdTarget-derived class, you will have to modify the reference counting by overriding the virtual AddRef() and Release() methods. Furthermore, you can override the QueryInterface() method and the MethodPrologue() method, which will be necessary in most MFC applications to activate the right module state (e.g. calling AFX_MANAGE_STATE(AfxGetStaticModuleState()) would be a good idea).
Because you cannot signal errors through an HRESULT return value in your derived classes, you can throw a CComException by calling the global AfxThrowComException() function. All the method calls in the implementation wrapper are enclosed in a try block followed by a catch(CComException *pE) handler. If you throw a CComException in your implementation, the implementation wrapper will return the given HRESULT value to the caller instead of the usual S_OK.
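The forwarding scheme can be sketched in portable C++ as follows. This is only an illustration under my own assumptions: IRawHandler, HandlerImplSketch, CollectingHandler, FeedText() and ComExceptionSketch are hypothetical names, and whether the real wrapper deletes the caught exception is not something this sketch can confirm.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

typedef std::int32_t HRESULT;        // stand-ins for the Windows definitions
const HRESULT S_OK = 0;
const HRESULT E_FAIL = -2147467259;  // 0x80004005 as a signed 32-bit value

// Hypothetical exception type, thrown and caught as a pointer.
struct ComExceptionSketch {
    HRESULT m_hr;
    void Delete() { delete this; }
};

// Hypothetical raw callback interface in COM style.
struct IRawHandler {
    virtual ~IRawHandler() {}
    virtual HRESULT characters(const char* text) = 0;
};

// Sketch of an implementation wrapper: the nested member implements the raw
// interface and forwards each call to a high-level pure virtual method.
class HandlerImplSketch {
public:
    HandlerImplSketch() : m_xHandler(*this) {}
    virtual ~HandlerImplSketch() {}
    IRawHandler* GetInterface() { return &m_xHandler; }

protected:
    virtual void MethodPrologue() {}                       // optional hook
    virtual void Characters(const std::string& text) = 0;  // what you implement

private:
    struct XHandler : IRawHandler {
        explicit XHandler(HandlerImplSketch& o) : owner(o) {}
        HRESULT characters(const char* text) override {
            try {
                owner.MethodPrologue();
                owner.Characters(text ? text : "");
                return S_OK;
            } catch (ComExceptionSketch* e) {
                const HRESULT hr = e->m_hr;  // translate the exception back
                e->Delete();
                return hr;
            }
        }
        HandlerImplSketch& owner;
    };
    XHandler m_xHandler;
};

// Example of a derived class: collects text and rejects empty input.
class CollectingHandler : public HandlerImplSketch {
public:
    std::vector<std::string> seen;

protected:
    void Characters(const std::string& text) override {
        if (text.empty()) throw new ComExceptionSketch{E_FAIL};
        seen.push_back(text);
    }
};

// Stand-in for the parser, which only knows the raw interface.
HRESULT FeedText(IRawHandler* handler, const char* text) {
    assert(handler != nullptr);
    return handler->characters(text);
}
```

The derived class only deals with std::string and exceptions, while the caller on the other side of the raw interface still sees plain status codes.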
CoClass wrappers
The CoClass wrappers are small wrapper classes which allow you to create COM objects and retrieve the specified interfaces. This frees you from calling CoCreateInstance().
CComException
Because the HRESULT values are encapsulated by the wrapper classes, it was necessary to provide a mechanism that allows your software to handle unexpected HRESULTs. Therefore the class CComException has been declared in InterfaceWrapper.h. The class is derived from CException and contains a single attribute m_hr of type HRESULT.
To throw a CComException you should use the static CComException::Throw() method or the global AfxThrowComException() function. Both create a CComException object on the heap, set its m_hr attribute to the specified value and throw the pointer.
If you have caught a pointer to a CComException object and you do not need it any more, you should call its Delete() method to destroy it.
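The throw-by-pointer pattern and its cleanup can be sketched without MFC like this. ComExceptionSketch and HresultOf() are hypothetical names of mine, and the live-object counter exists only to show that Delete() really frees the exception; the real CComException inherits this behaviour from CException.

```cpp
#include <cassert>
#include <cstdint>

typedef std::int32_t HRESULT;        // stand-ins for the Windows definitions
const HRESULT S_OK = 0;
const HRESULT E_FAIL = -2147467259;  // 0x80004005 as a signed 32-bit value

// Sketch of a heap-allocated exception that is thrown and caught by pointer.
class ComExceptionSketch {
public:
    static int s_live;  // number of exception objects currently alive
    HRESULT m_hr;

    // Creates the object on the heap and throws the pointer.
    static void Throw(HRESULT hr) { throw new ComExceptionSketch(hr); }

    // Whoever catches the pointer calls Delete() once it is done with it.
    void Delete() { delete this; }

private:
    explicit ComExceptionSketch(HRESULT hr) : m_hr(hr) { ++s_live; }
    ~ComExceptionSketch() { --s_live; }
};
int ComExceptionSketch::s_live = 0;

// Runs an operation and reports its failure code, cleaning up the exception.
template <class Op>
HRESULT HresultOf(Op op) {
    try {
        op();
        return S_OK;
    } catch (ComExceptionSketch* e) {
        assert(e != nullptr);
        const HRESULT hr = e->m_hr;  // read the code before destroying the object
        e->Delete();
        return hr;
    }
}
```

For example, HresultOf([] { ComExceptionSketch::Throw(E_FAIL); }) yields E_FAIL and leaves no exception object behind.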
XML SAX wrapper nodes
For the SAX implementation, Microsoft provides two kinds of interfaces: the C++ interfaces and the Visual Basic interfaces. Because I have generated all the wrapper classes with a tool that analyzes the parameters of each method to generate the method wrapper, I needed to use the Visual Basic interfaces to get "nice" wrapper classes. The problem with the C++ interfaces (which may be a little bit faster) is that they use two parameters for each string: the string itself and a length argument. I did not want to modify the generation tool to recognize this, because this case is rare, and so I used the somewhat slower Visual Basic interfaces, which use normal BSTRs that are recognized by the generation tool.
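To illustrate what the generation tool would have had to handle for the C++ interfaces, here is a small sketch that turns such a (pointer, length) pair into a single string object. FromCountedString() is a hypothetical helper, not part of MSXMLCPP, and std::wstring stands in for CString so the sketch stays portable.

```cpp
#include <cassert>
#include <string>

// Builds a string from the (pointer, length) pair used by the C++-style SAX
// callbacks. The buffer is not assumed to be zero-terminated, so the length
// argument must be honoured.
std::wstring FromCountedString(const wchar_t* chars, int count) {
    assert(count >= 0);
    if (chars == nullptr || count == 0)
        return std::wstring();
    return std::wstring(chars, static_cast<std::wstring::size_type>(count));
}
```

A wrapper generator would have to recognize every such parameter pair and emit a call like this one, which is the special case the Visual Basic interfaces avoid.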
Drawbacks
As with everything good, MSXMLCPP has a drawback: the wrapper classes add some run-time overhead for the type conversions and the additional function calls. So if your application has to be very fast, it may be better to work on the raw interfaces, but I think the overhead is small enough to be justified by the more readable code.
Requirements
To use the MSXMLCPP library with your application, simply include the msxmlcpp.h header file in your stdafx.h (or anywhere else) and link to msxmlcpp.lib (msxmlcppD.lib for the debug version). MSXMLCPP uses RTTI (Runtime Type Information), so it would be a good idea to enable it for your application too.
To run an application using the MSXMLCPP DLL, the Microsoft XML parser (MSXML) has to be installed on the system. You can download it from the Microsoft homepage.
This should run on Windows 95 or later and Windows NT 3.1 or later.
Unicode support
Though I haven't tested it, there is no reason why this should not work with Unicode.
How the wrapper classes have been generated
All the wrapper classes have been generated with a tool of mine that takes a type library and some rules as input and generates the wrapper classes from the information in the type library. I hope I will be able to provide a first free beta of this tool on CodeProject in a few weeks. I think this will make the work much easier for all of us C++ programmers.
The code examples
Two code examples that use the library are included in the full download. Both examples are simple console applications, to concentrate on the interesting stuff. Both examples expect a file name as a command-line parameter, to be used as the XML file; a valid file (books.xml) is contained in the Output directory of the ZIP archive.
The application DOMXMLDemo demonstrates how to read and write an XML file using the DOM wrapper classes.
The application SAXXMLDemo demonstrates how to read XML files using the SAX wrapper classes. This also shows how to implement interfaces using implementation wrapper classes.