Custom XPath Functions

28 Aug 2006
Using custom functions to extend XPath expressions

Introduction

XPath expressions are a powerful tool, but they have limitations. Something you need may not be in the specification yet, you may need behavior slightly different from what the specification defines, or there may simply be no way to do it without custom code. Custom XPath functions are a means of providing the extra functionality needed to solve problems that are not covered any other way.

For this discussion, we will focus on the problem of performing case-insensitive searches in XPath expressions. The technique of creating and using custom functions and variables can certainly be applied to extend your XPath expression as necessary to solve other problems.

The Problem

XML is case sensitive, and although this can be a good thing, it is sometimes a source of frustration. Validating the XML can enforce consistent formatting, but this is not always possible, either because no schema is available or because you do not control the XML format: you get what you get and have to make it work. When selecting nodes with an XPath expression, "*//Address[@type = 'home']" and "*//Address[@type = 'Home']" are different queries. Given the XML snippet below, each of these queries returns only one node.

    <Addresses>
      <Address type="Home">
        <Street>100 Main St</Street>
        <City>Uppercase</City>
        <State>AZ</State>
        <Zipcode>12345</Zipcode>
      </Address>
      <Address type="home">
        <Street>100 Main St</Street>
        <City>Lowercase</City>
        <State>az</State>
        <Zipcode>12345</Zipcode>
      </Address>
      <Address type="business">
        <Street>1 Business Way</Street>
        <City>Lowercase</City>
        <State>AZ</State>
        <Zipcode>12345</Zipcode>
      </Address>
    </Addresses>

But what if you need to find all the nodes, regardless of casing? One way to do this type of matching would be to iterate through a node list and filter the results as below:

XPathDocument doc = new XPathDocument("Test.xml");
XPathNavigator nav = doc.CreateNavigator();

// Select every Address node, then filter by hand
XPathNodeIterator nodes = nav.Select("*//Address");
foreach(XPathNavigator node in nodes)
{
    string attr = node.GetAttribute("type", "");
    if(attr.ToLower() == "home")
    {
        // Do something...
    }
}

This is inefficient and cumbersome since you must retrieve the nodes, iterate through them, and filter out the ones that don't match. A better way would be to filter the results that are returned in the first place.

XsltContext

The .NET Framework lets you add custom functions to your XPath expressions by specifying an XsltContext. This abstract class provides a context for the XSLT processor to resolve any functions, variables and namespaces used in the XPath expression. When deriving a class from XsltContext you must implement four methods and one property. The two most important methods are ResolveFunction and ResolveVariable. The remaining members (CompareDocument, PreserveWhitespace, and the Whitespace property) have their own purposes, but during my explorations I have not found a use for them.

public override IXsltContextFunction ResolveFunction(string prefix, 
    string name, XPathResultType[] ArgTypes)
public override IXsltContextVariable ResolveVariable(string prefix, string name)
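The remaining members still have to be implemented before the class will compile. The following is a minimal sketch of what the derived class might look like, not the article's actual download. The class name CustomContext and its ArgumentList property come from the snippets later in the article; the constructors and the trivial return values for Whitespace, PreserveWhitespace and CompareDocument are assumptions chosen only to satisfy the abstract base class. The two Resolve methods are left as placeholders here.

```csharp
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

// Sketch of an XsltContext-derived class (assumed layout).
public class CustomContext : XsltContext
{
    private readonly XsltArgumentList argumentList;

    public CustomContext() : this(new XsltArgumentList())
    {
    }

    // XsltContext derives from XmlNamespaceManager, which needs a name table.
    public CustomContext(XsltArgumentList args) : base(new NameTable())
    {
        argumentList = args;
    }

    // Holds the values of any variables used in the XPath expression.
    public XsltArgumentList ArgumentList
    {
        get { return argumentList; }
    }

    // The three members below are not needed for function or variable
    // resolution; these return values are placeholders (assumption).
    public override bool Whitespace
    {
        get { return true; }
    }

    public override bool PreserveWhitespace(XPathNavigator node)
    {
        return true;
    }

    public override int CompareDocument(string baseUri, string nextbaseUri)
    {
        return 0;
    }

    // Placeholder: return an IXsltContextFunction for known names.
    public override IXsltContextFunction ResolveFunction(string prefix,
        string name, XPathResultType[] ArgTypes)
    {
        return null;
    }

    // Placeholder: return an IXsltContextVariable for known names.
    public override IXsltContextVariable ResolveVariable(string prefix,
        string name)
    {
        return null;
    }
}
```

Keeping the XsltArgumentList on the context gives both the function and the variable implementations a single place to look up parameter values.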

ResolveFunction

When using an XPath expression that contains a function, such as below, the processor must be able to resolve the function in its context.

"*//Address[compare(string(@type),'home')]" 

The ResolveFunction method returns the implementation of IXsltContextFunction that corresponds to the name passed in. The prefix parameter is the namespace prefix, if any, associated with the function name. The ArgTypes parameter is an array holding the XPathResultType of each argument used in the function call. For the function above it will have two elements, both XPathResultType.String.

/// <summary>
/// Override for XsltContext method
/// </summary>
/// <param name="prefix">Namespace prefix for function</param>
/// <param name="name">Name of function</param>
/// <param name="ArgTypes">Array of XPathResultType</param>
/// <returns>Implementation of IXsltContextFunction</returns>
public override IXsltContextFunction ResolveFunction(string prefix, 
    string name, XPathResultType[] ArgTypes)
{
    IXsltContextFunction func = null;

    switch(name)
    {
        case "compare":
            func = new CompareFunction();
            break;
        default:
            break;
    }
    return func;
}

IXsltContextFunction

The ResolveFunction method above returns an implementation of IXsltContextFunction. This interface has one method (Invoke) and four properties (ArgTypes, Minargs, Maxargs and ReturnType) that must be implemented. I have not found a case where the property values mattered, although ReturnType is accessed for each parameter in the function. The Invoke method is what is called during processing of the XPath expression.

public object Invoke(XsltContext xsltContext, object[] args, 
    XPathNavigator docContext)
{
    if(args.Length != 2)
        throw new ApplicationException(
            "Two arguments must be provided to compare function.");

    string Arg1 = args[0].ToString();
    string Arg2 = args[1].ToString();

    // Case-insensitive comparison; true keeps the node in the result
    return String.Compare(Arg1, Arg2, true) == 0;
}

This method is where the work of the function takes place. In this case, after verifying that both arguments were supplied, it makes a case-insensitive comparison. If the strings match, the method returns true so that the node is included in the node-set returned by the select.
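Invoke alone is not enough for the class to compile, since the interface's four properties must be present as well. The following is one possible sketch of a complete CompareFunction. The property values (exactly two string arguments, Boolean result) are assumptions inferred from how the function is used in this article, and the comparison here uses StringComparison.OrdinalIgnoreCase, which is a substitution for the culture-sensitive String.Compare call shown above.

```csharp
using System;
using System.Xml.XPath;
using System.Xml.Xsl;

// Sketch of a case-insensitive compare(a, b) XPath function.
public class CompareFunction : IXsltContextFunction
{
    // compare() takes exactly two arguments (assumption).
    public int Minargs
    {
        get { return 2; }
    }

    public int Maxargs
    {
        get { return 2; }
    }

    // The result is used as a predicate, so report a Boolean.
    public XPathResultType ReturnType
    {
        get { return XPathResultType.Boolean; }
    }

    public XPathResultType[] ArgTypes
    {
        get
        {
            return new XPathResultType[]
            {
                XPathResultType.String,
                XPathResultType.String
            };
        }
    }

    public object Invoke(XsltContext xsltContext, object[] args,
        XPathNavigator docContext)
    {
        if (args == null || args.Length != 2)
            throw new ArgumentException(
                "Two arguments must be provided to compare function.");

        // Convert.ToString returns an empty string for a null argument.
        string left = Convert.ToString(args[0]);
        string right = Convert.ToString(args[1]);

        return string.Equals(left, right, StringComparison.OrdinalIgnoreCase);
    }
}
```

An ordinal comparison avoids culture-specific casing rules; if culture-aware matching is what you need, the String.Compare overload from the article's version is the one to keep.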

XPathDocument doc = new XPathDocument("Test.xml");
XPathNavigator nav = doc.CreateNavigator();

// Create a custom context for the XPathExpression
CustomContext ctx = new CustomContext();
// 'type' is a string holding the value to match, e.g. "home"
string XPath = string.Format("*//Address[compare(string(@type),'{0}')]", type);

// Create an XPathExpression
XPathExpression exp = nav.Compile(XPath);

// Set the context to resolve the function
// ResolveFunction is called at this point
exp.SetContext(ctx);

// Select nodes based on the XPathExpression
// IXsltContextFunction.Invoke is called for each
// node to filter the resulting nodeset
XPathNodeIterator nodes = nav.Select(exp);

ResolveVariable

If you have used a variable in your XPath expression, such as this...

XPathExpression exp = nav.Compile("*//Address[compare(string(@type))=$value]");

... it will be resolved by calling the ResolveVariable method of the XsltContext derived class.

/// <summary>
/// Override for XsltContext method used to resolve variables
/// </summary>
/// <param name="prefix">Namespace prefix for variable</param>
/// <param name="name">Name of variable</param>
/// <returns>CustomVariable</returns>
public override IXsltContextVariable ResolveVariable(string prefix, 
    string name)
{
    return new CustomVariable(name);
}

A CustomVariable class, which implements IXsltContextVariable, is returned. The IXsltContextVariable interface has one method to implement, Evaluate, along with three properties (IsLocal, IsParam and VariableType). The Evaluate method is called at runtime to retrieve the value of the specified variable.

/// <summary>
/// Gets the value of the variable specified
/// </summary>
/// <param name="xsltContext">Context in which this variable is used</param>
/// <returns>Value of the variable</returns>
public object Evaluate(XsltContext xsltContext)
{
    XsltArgumentList args = ((CustomContext)xsltContext).ArgumentList;
    return args.GetParam(Name, "");
}
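For reference, here is a sketch of what a full CustomVariable class could look like, including the three properties the interface requires. It is an illustration under stated assumptions rather than the article's own class: to keep the block compilable on its own, this variant receives the XsltArgumentList through its constructor instead of casting the context to CustomContext, and the property values (not local, not a parameter, type Any) are assumed defaults.

```csharp
using System.Xml.XPath;
using System.Xml.Xsl;

// Sketch of an XPath variable backed by an XsltArgumentList.
public class CustomVariable : IXsltContextVariable
{
    private readonly string name;
    private readonly XsltArgumentList arguments;

    // The second parameter is an assumption made for self-containment;
    // the article's version reads the list from the context instead.
    public CustomVariable(string name, XsltArgumentList arguments)
    {
        this.name = name;
        this.arguments = arguments;
    }

    public string Name
    {
        get { return name; }
    }

    public bool IsLocal
    {
        get { return false; }
    }

    public bool IsParam
    {
        get { return false; }
    }

    // Any lets the processor decide how to treat the value.
    public XPathResultType VariableType
    {
        get { return XPathResultType.Any; }
    }

    // Looks the value up by name; returns null if it was never added.
    public object Evaluate(XsltContext xsltContext)
    {
        return arguments.GetParam(name, "");
    }
}
```

The value itself is supplied before the expression is evaluated, for example with `argumentList.AddParam("value", "", "home")` on the same XsltArgumentList instance.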

The variable can then be used in the IXsltContextFunction.Invoke:

public object Invoke(XsltContext xsltContext, object[] args, 
    XPathNavigator docContext)
{
    // Read the variable's value from the context's argument list
    string Value = 
        ((CustomContext)xsltContext).ArgumentList.GetParam("value", "").ToString();

    string Arg1 = args[0].ToString();

    return String.Compare(Arg1, Value, true) == 0;
}

Using a variable in the XPath expression changes when the IXsltContextFunction.Invoke method is called. With a variable, the method is called while iterating over the node-set; without one, it is called when the Select is executed.

Conclusion

This article has demonstrated using a custom XPath function to solve a simple problem. You should be able to use the techniques to extend the capabilities of XPath in your application.

History

  • 28th August, 2006: Initial post

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
