XSLT Extension Functions

30 Jan 2004
Extending the functionality of the XSLT Processor.

Introduction

XML and XSLT have done much to ease the processing of chunks of data. No longer must we write a custom parser for each 'fixed field' layout we encounter, which can become a maintenance nightmare. With XSLT, we can transform XML into another shape, for presentation or for another consumer process. When standard XSLT functionality is not enough, we can extend its reach by adding our own custom components.

The type of functionality that can be introduced is not limited, but one must be judicious in its use. In my opinion, it would be unwise to drive business logic from a stylesheet. There are other scenarios, however, where this may be practical: data that needs to be translated against a database, or a complex calculation that exceeds the limitations of the built-in math functionality.

The Sample Code

The sample is made up of a WTL test client, an XSLT Transform component and an extension function component to do the database lookup. I have included a subset of the Bibliography Access database as a data source, a stylesheet and a few input files.

The Extension Function

The extension function is a regular COM component that must support IDispatch, but there are a few things to keep in mind.

  • In XML, everything is a string.
  • Numbers can be coerced using the number function for input parameters. The XSLT Processor will convert numeric output parameters into strings.
  • XML can be passed back to the processor. You must use the disable-output-escaping attribute of the xsl:value-of tag in order to have valid XML. If not, you'll end up with an un-parsable string.

The sample extension function exposes 3 public methods and 1 property. Each has a different parameter list to show some of the nuances of making calls outside of the XSLT Processor.

interface IBibTitles : IDispatch{
    [id(1), helpstring("method GetBookTitle")]
    HRESULT GetBookTitle([in] BSTR isbn, [out,retval] BSTR* bookTitle);
    [id(2), helpstring("method GetYearPublished")]
    HRESULT GetYearPublished([in] BSTR isbn, [out,retval] LONG* yearPublished);
    [id(3), helpstring("method GetIsbnByYear")]
    HRESULT GetIsbnByYear([in] LONG year, [out,retval] BSTR* result);
    [propget, id(4), helpstring("property ProcessTime")]
    HRESULT ProcessTime([out, retval] BSTR* pVal);
};

Each method queries the database, and the return value is written directly to the XML output stream. If the query does not return any rows, the returned value is NULL or, in the case of a LONG, it defaults to -1. Any errors raised by this component are re-thrown from the XSLT Processor, thereby causing the transformation to fail. GetIsbnByYear queries for 0 or more rows and returns them as an XML string. This is accomplished by using CXmlAccessor. The property, ProcessTime, returns the current system time.
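The return conventions described above can be sketched without COM or a database. What follows is only a model under my own assumptions: the Catalog map stands in for the Access database, the element names ISBNS and ISBN are made up (the real output of CXmlAccessor will look different), and returning an empty wrapper element for zero rows is a guess.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory stand-in for the bibliography database.
struct Book {
    std::string title;
    long year;
};
typedef std::map<std::string, Book> Catalog;   // keyed by ISBN

// No matching row: return nullptr, standing in for the NULL BSTR.
const std::string* getBookTitle(const Catalog& db, const std::string& isbn) {
    Catalog::const_iterator it = db.find(isbn);
    return it == db.end() ? nullptr : &it->second.title;
}

// No matching row: return the -1 sentinel used for LONG results.
long getYearPublished(const Catalog& db, const std::string& isbn) {
    Catalog::const_iterator it = db.find(isbn);
    return it == db.end() ? -1 : it->second.year;
}

// Escape the characters that are significant in XML text content.
std::string escapeXml(const std::string& text) {
    std::string out;
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;";  break;
            case '>': out += "&gt;";  break;
            default:  out += c;
        }
    }
    return out;
}

// Zero or more rows returned as one XML string.
// The element names are invented for this sketch.
std::string getIsbnByYear(const Catalog& db, long year) {
    std::string xml = "<ISBNS>";
    for (Catalog::const_iterator it = db.begin(); it != db.end(); ++it) {
        if (it->second.year == year) {
            xml += "<ISBN>" + escapeXml(it->first) + "</ISBN>";
        }
    }
    return xml + "</ISBNS>";
}
```

A stylesheet that receives the -1 sentinel still has to test for it explicitly, since the processor will happily write "-1" to the output.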

Other than that, there's nothing special about this component. It can be written in any language that supports COM: VC++, Visual Basic or any .NET language. If you use a .NET language, you will need to register the assembly's type library (Regasm.exe).

The XSLTransform Component

In order to get an XSLT Processor, load the stylesheet into a FreeThreadedDOMDocument40 object, create an XSLTemplate40 object, then set the stylesheet property of the XSLTemplate to the DOMDocument. The stylesheet is compiled once, is read-only, and can be used concurrently by other threads to perform the transform.

I create the Free Threaded Document, XSLTemplate and the extension function in FinalConstruct() and these are available as long as this component is in memory. In each call to the Transform method, I query the XSLTemplate for a processor, add the extension function, then transform the input.

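// Get a processor from the cached template.
MSXML2::IXSLProcessorPtr pProc = pTemplate->createProcessor();

// Register the extension object under its namespace URI.
pProc->addObject(pExtFunc, _T("http://myuniquenamespace/titles"));

// Supply the input document and run the transform.
pProc->input = pDom.GetInterfacePtr();
pProc->transform();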

Notice the call to addObject. The second parameter is the namespace for this extension. It can be anything you like, but it's a good idea to keep it unique. If you've ever implemented IScriptHost, then this may look familiar. We are adding an object and specifying a name in order to call it from script. You can add as many objects to the processor as you like, each with a different namespace.

The Stylesheet

In the stylesheet, we must declare the object namespace in the stylesheet or transform tag. The namespace prefix can be anything you like; here I used 'db'. The namespace URI must be identical to the one used in the call to addObject.

<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:db="http://myuniquenamespace/titles">
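To show why the two namespace strings have to match, here is a toy model of the lookup: prefix to namespace URI, URI to registered object, then method by name. It is a sketch for illustration only. ToyProcessor, declarePrefix and call are names I made up, and this is not how MSXML is implemented internally. The one rule I am relying on is that namespace names are compared character for character.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Toy extension object: method name -> callable taking one string argument.
typedef std::map<std::string,
                 std::function<std::string(const std::string&)> > ExtensionObject;

// Hypothetical model of the lookup; not the MSXML implementation.
class ToyProcessor {
public:
    // Plays the role of addObject: register an object under a namespace URI.
    void addObject(const ExtensionObject& obj, const std::string& namespaceUri) {
        objects_[namespaceUri] = obj;
    }

    // Plays the role of xmlns:prefix="uri" in the stylesheet.
    void declarePrefix(const std::string& prefix, const std::string& namespaceUri) {
        prefixes_[prefix] = namespaceUri;
    }

    // Resolve "prefix:Method" and invoke it with a single string argument.
    std::string call(const std::string& qname, const std::string& arg) const {
        const std::size_t colon = qname.find(':');
        if (colon == std::string::npos) {
            throw std::invalid_argument("expected prefix:name");
        }
        auto p = prefixes_.find(qname.substr(0, colon));
        if (p == prefixes_.end()) {
            throw std::runtime_error("undeclared prefix");
        }
        // Exact, case-sensitive comparison of the namespace URI.
        auto o = objects_.find(p->second);
        if (o == objects_.end()) {
            throw std::runtime_error("no object registered for " + p->second);
        }
        auto m = o->second.find(qname.substr(colon + 1));
        if (m == o->second.end()) {
            throw std::runtime_error("unknown method");
        }
        return m->second(arg);
    }

private:
    std::map<std::string, std::string> prefixes_;
    std::map<std::string, ExtensionObject> objects_;
};
```

In this model, registering the object under http://myuniquenamespace/titles and declaring the prefix with a URI that differs by even one character, including letter case, makes the call fail.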

Call the extension function in the select attribute of the xsl:value-of tag. Keep in mind the parameter types of the method calls. The following will produce a Type Mismatch error.

<xsl:value-of select="db:GetBookTitle(ROOT/BK)"/>

This is because the XPath expression ROOT/BK evaluates to a node-set, not a string. Our extension component doesn't know anything about XML, just strings and numbers. We can fix this by using the string and number functions.

<xsl:value-of select="db:GetBookTitle(string(ROOT/BK))"/>

If the XPath expression evaluates to a node that has children, the string value is the concatenation of all the text under that node. With the sample input below, the values come out space-delimited.

XML data:
<ROOT>
    <BK>0-0230081-2-1</BK>
    <BK>0-0230948-1-8</BK>
    <BK>0-0301413-4-6</BK>
</ROOT>

XPath:
<xsl:value-of select="string(ROOT)"/>
Evaluates to:
0-0230081-2-1 0-0230948-1-8 0-0301413-4-6

Keep this in mind when you're debugging.
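The string conversion can be modeled in a few lines, which may help when a lookup receives an unexpected argument. This sketch follows the XPath 1.0 rules as I understand them: the string-value of an element is the concatenation of its descendant text nodes in document order, and string applied to a node-set uses only the first node. The Node type and helper names are my own, and the single-space text nodes are an assumption standing in for the whitespace between elements; how MSXML treats that whitespace depends on settings this model ignores.

```cpp
#include <string>
#include <vector>

// Minimal tree for illustration (hypothetical type, not an MSXML interface).
// A node with an empty name is a text node.
struct Node {
    std::string name;             // element name; empty for text nodes
    std::string text;             // content; used only by text nodes
    std::vector<Node> children;   // child nodes; used only by elements
};

Node makeText(const std::string& value) {
    Node n;
    n.text = value;
    return n;
}

Node makeElement(const std::string& name, const std::vector<Node>& children) {
    Node n;
    n.name = name;
    n.children = children;
    return n;
}

// String-value of a node: a text node yields its text, an element yields
// the concatenation of all descendant text nodes in document order.
std::string stringValue(const Node& node) {
    if (node.name.empty()) return node.text;
    std::string result;
    for (const Node& child : node.children) {
        result += stringValue(child);
    }
    return result;
}

// string() over a node-set: only the first node counts; empty set gives "".
std::string stringOfNodeSet(const std::vector<const Node*>& nodes) {
    return nodes.empty() ? std::string() : stringValue(*nodes.front());
}
```

Built from the sample input, the model gives the full space-separated list for ROOT, but only the first ISBN for the node-set ROOT/BK. That is why string(ROOT/BK) looks up a single title even when the input holds several BK elements.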

If your component exposes properties, they can be referenced using get-[Property] and set-[Property].

<xsl:value-of select="db:get-ProcessTime()"/>

Summary

That about does it. The only thing left is the test client but there's nothing special happening there.

Using the XSLTemplate and XSLProcessor will increase performance by compiling the stylesheet once and reusing it each time a transformation is performed, much like a stored procedure.

An XSLT Processor extension function is a regular COM component that supports IDispatch. Everything in XML is a string, so pay special attention to parameter types.

Included with the sample code are 3 input files. The first, input1.xml, has a few ISBN numbers that are present in the database. Input2.xml has 1 ISBN number that is not in the database and a tag that will cause the stylesheet to call GetIsbnByYear. The last input file, input3.xml, is invalid XML. I used this to test failure conditions; I also changed the stylesheet to force a failure when it loads.

With .NET on the move, MSXML4 may already be outdated. Whether or not that is the case, I think it's a good idea to know what's going on behind the scenes. Although I don't believe that extending the processor will always be the best overall choice, it's nice to have when you need it.

Points of Interest

While I have not spent lots of time exploring the full XML feature set provided by .NET, I'm sure that it has improvements over MSXML 4. I've spent a number of hours trying to pass an XML fragment into the extension function with no success.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
