Introduction
This article has two goals:
- It shows how to write your own XPathNavigator implementation and use it to evaluate
XPath expressions and apply XSLT transformations to structures that were never intended
to be used this way.
- It presents an alternative way to work with files and
folders that some people may find useful.
What is XPathNavigator
XPathNavigator is the system abstract class that implements an XPath document model and
provides a means of navigating through XML nodes and evaluating XPath expressions. Unlike
XmlNode or XNode, XPathNavigator is a cursor that may point to any node in the XML tree
and be moved to another node. XPathNavigator is also used as an input for
XslCompiledTransform, and therefore any implementation of XPathNavigator can be
transformed with an XSLT stylesheet.
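To illustrate the cursor behaviour described above, here is a minimal sketch that uses the navigator XmlDocument provides. The class and method names (CursorDemo, Walk) and the sample XML are made up for this illustration.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

public static class CursorDemo
{
    // Walks a small in-memory document with a single navigator instance.
    public static string Walk()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root><a id='1'/><b/></root>");

        XPathNavigator nav = doc.CreateNavigator(); // positioned on the document root
        nav.MoveToFirstChild();                     // now on <root>
        nav.MoveToFirstChild();                     // now on <a>
        string first = nav.LocalName;
        nav.MoveToNext();                           // same object, now on <b>
        string second = nav.LocalName;

        // XPath is evaluated relative to wherever the cursor currently points.
        double siblings = (double)nav.Evaluate("count(../*)");
        return first + "," + second + "," + siblings;
    }
}
```

The point of the sketch is that one navigator object is reused for every position, whereas XmlNode or XNode would hand you a different object per node.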
XPathNavigator implementations exist for all XML models in .NET, including XmlDocument
and LINQ to XML. Generally, an XPathNavigator can be created for any class that
implements the IXPathNavigable interface. This interface contains a single method,
CreateNavigator.
The classes XmlNode and XPathDocument (a special fast model that only provides read-only
access via the XPathNavigator model) implement IXPathNavigable. However, this is not
always the case: LINQ to XML, the newest library for working with XML, creates an
XPathNavigator through extension methods instead.
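As a small illustration of the extension-method route, the sketch below obtains a navigator from an XDocument. The CreateNavigator extension lives in the System.Xml.XPath namespace; LinqNavigatorDemo and the sample XML are names invented for this example.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath; // brings the CreateNavigator extension method into scope

public static class LinqNavigatorDemo
{
    // Counts <item> elements through a navigator created over LINQ to XML.
    public static double CountItems()
    {
        XDocument doc = XDocument.Parse("<list><item/><item/><other/></list>");
        XPathNavigator nav = doc.CreateNavigator();
        return (double)nav.Evaluate("count(/list/item)");
    }
}
```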
It's worth mentioning that none of the built-in XPathNavigator implementations are public.
How to implement XPathNavigator
XPathNavigator contains 116 public members, 112 of which can be overridden.
The good news is that only 20 of them are abstract, i.e., must be implemented.
Here they are:
Properties
XmlNameTable NameTable
XPathNodeType NodeType
string LocalName
string Name
string NamespaceURI
string Prefix
string BaseURI
bool IsEmptyElement
Methods
XPathNavigator Clone()
bool MoveToFirstAttribute()
bool MoveToNextAttribute()
bool MoveToFirstNamespace(XPathNamespaceScope namespaceScope)
bool MoveToNextNamespace(XPathNamespaceScope namespaceScope)
bool MoveToNext()
bool MoveToPrevious()
bool MoveToFirstChild()
bool MoveToParent()
bool MoveTo(XPathNavigator other)
bool MoveToId(string id)
bool IsSamePosition(XPathNavigator other)
Implementing these 20 members is enough for a complete, working read-only XPathNavigator
implementation. If you want to support modification of the model, you need to override
some more, at least those that throw NotSupportedException by default:
void SetValue(string value)
XmlWriter PrependChild()
XmlWriter AppendChild()
XmlWriter InsertAfter()
XmlWriter InsertBefore()
XmlWriter CreateAttributes()
XmlWriter ReplaceRange(XPathNavigator lastSiblingToReplace)
void DeleteRange(XPathNavigator lastSiblingToDelete)
As you can see, in order to implement a writable XPathNavigator you must, unfortunately,
implement XmlWriter too.
In this article we will only need read-only functionality.
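To see what "read-only" means in practice, the following sketch calls SetValue on the navigator of an XPathDocument, which is a read-only model. ReadOnlyDemo and SetValueIsRejected are illustrative names, not part of the article's source.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class ReadOnlyDemo
{
    // Returns true when the navigator reports itself as non-editable
    // and rejects an attempt to change a value.
    public static bool SetValueIsRejected()
    {
        var doc = new XPathDocument(new StringReader("<a>text</a>"));
        XPathNavigator nav = doc.CreateNavigator();
        nav.MoveToFirstChild(); // now on <a>

        try
        {
            nav.SetValue("changed");
            return false;
        }
        catch (NotSupportedException)
        {
            // The editing members were not overridden, so the base class throws.
            return !nav.CanEdit;
        }
    }
}
```

A navigator that leaves the editing members untouched, like the one built in this article, should behave the same way.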
File system as XML
The XPath model can actually be applied not only to XML but to any tree-like structure,
and the file system, which consists of files and directories, is one of them. In our model
we will treat files and directories as XML elements. In this simple implementation I used
file names as the names of XML elements. This is convenient because XPath expressions then
become very similar to file paths or local URIs. I didn't use any namespace or prefix for
file nodes for the same reason: if they had prefixes, the XPath expressions would become
more cumbersome.
I rendered file properties, such as full name, extension, and file attributes, as
XML attributes. For the sake of simplicity, there won't be any text nodes in this
implementation, and no part of a file's content will be included in the tree.
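To make the mapping concrete, the following sketch copies a directory into a LINQ to XML tree using the same conventions: encoded file names as element names and file properties as attributes. It is only an illustration, not the navigator built in this article, which walks the file system on demand instead of copying it. FileTreeSketch and ToElement are made-up names, and the attribute names FullName, Extension, and Length are the ones that appear in the XPath examples later in the article. Access errors are not handled.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class FileTreeSketch
{
    // Builds an XElement for a file or directory, recursing into directories.
    public static XElement ToElement(FileSystemInfo info)
    {
        var element = new XElement(
            XmlConvert.EncodeLocalName(info.Name), // file names may contain characters XML forbids
            new XAttribute("FullName", info.FullName),
            new XAttribute("Extension", info.Extension));

        var file = info as FileInfo;
        if (file != null)
        {
            // Files are leaves: add their size and stop.
            element.Add(new XAttribute("Length", file.Length));
            return element;
        }

        // Directories get one child element per entry, in a stable order.
        var directory = (DirectoryInfo)info;
        foreach (FileSystemInfo child in
                 directory.GetFileSystemInfos().OrderBy(c => c.Name, StringComparer.Ordinal))
        {
            element.Add(ToElement(child));
        }
        return element;
    }
}
```

Calling FileTreeSketch.ToElement(new DirectoryInfo(path)) on a small folder and printing the result gives a quick feel for the XML that the XPath expressions in this article operate on.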
Implementing XPathNavigator for file system
As I mentioned above, XPathNavigator is a cursor that can be moved around the tree and
point to any part of it, including elements (files and folders in our case) and
attributes. For this reason we will delegate all calls in the XPathNavigator to an
internal instance of a helper class. There will be one class per type of node.
Since we decided not to deal with namespaces, some members need no real implementation.
The following getters will just return String.Empty:
public override string NamespaceURI { get { return String.Empty; } }
public override string Prefix { get { return String.Empty; } }
public override string BaseURI { get { return String.Empty; } }
The following methods will return false.
public override bool MoveToFirstNamespace(XPathNamespaceScope namespaceScope) { return false; }
public override bool MoveToNextNamespace(XPathNamespaceScope namespaceScope) { return false; }
The property Name will return the same value as LocalName:
public override string Name { get { return LocalName; } }
We will not define IDs for our nodes, so MoveToId will also return false:
public override bool MoveToId(string id) { return false; }
Some of you might feel uneasy about the NameTable property, as it returns an instance of
the abstract class XmlNameTable. This property is called during the XSL transformation
and is intended to improve performance by comparing strings by instance rather than by
value. The framework does ship a public implementation, System.Xml.NameTable, that could
be returned here, but fortunately you can safely return null from it.
public override XmlNameTable NameTable { get { return null; } }
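For background on what "comparing strings by instance" means, here is a small sketch of string atomization using System.Xml.NameTable, the framework's stock XmlNameTable subclass. NameTableDemo and SameInstance are names made up for this example; the navigator in this article does not use a name table at all.

```csharp
using System;
using System.Xml;

public static class NameTableDemo
{
    // Adds two equal but distinct strings and checks that the table
    // hands back one shared instance for both.
    public static bool SameInstance()
    {
        var table = new NameTable();
        string first = table.Add("Extension");

        // Build an equal string at run time so it is a separate object.
        string second = table.Add(new string("Extension".ToCharArray()));

        return object.ReferenceEquals(first, second);
    }
}
```

Once names are atomized this way, a consumer can compare them with a reference check instead of a character-by-character comparison.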
Wow! We have already implemented 8 of the 20 mandatory members! The remaining 12
will be delegated to the aforementioned internal instance. You can think of it as the
navigator's State object. For methods that start with Move* and return a Boolean, we will
change the signature: the methods in the state return a new state instance linked to
the node at which the navigator will point after moving. If the navigator can't
be moved to the specified position, the state method returns null and the corresponding
method in the XPathNavigator returns false.
The abstract class representing the internal state is defined as follows:
internal abstract class XPathItem
{
    public abstract string Name { get; }
    public abstract XPathItem MoveToFirstAttribute();
    public abstract XPathItem MoveToNextAttribute();
    public abstract XPathItem MoveToFirstChild();
    public abstract XPathItem MoveToNext();
    public abstract XPathItem MoveToPrevious();
    public abstract XPathNodeType NodeType { get; }
    public virtual string Value { get { return string.Empty; } }
    public abstract bool IsEmptyElement { get; }
    public abstract XPathItem MoveToParent();
    public abstract XPathItem MoveToAttribute(string name);
    public abstract bool IsSamePosition(XPathItem item);
}
Now we can implement the rest of our XPathNavigator.
public class FileSystemXPathNavigator : XPathNavigator
{
    // Current position of the cursor; replaced on every successful move.
    private XPathItem _item;

    public FileSystemXPathNavigator(string fileName)
    {
        _item = RootItem.CreateItem(fileName);
    }

    private FileSystemXPathNavigator(XPathItem item)
    {
        _item = item;
    }

    // The eight trivial overrides shown earlier (NamespaceURI, Prefix, BaseURI,
    // MoveToFirstNamespace, MoveToNextNamespace, Name, MoveToId, NameTable) go here.

    public override XPathNavigator Clone()
    {
        return new FileSystemXPathNavigator(_item);
    }

    public override bool IsEmptyElement
    {
        get { return _item.IsEmptyElement; }
    }

    public override bool IsSamePosition(XPathNavigator other)
    {
        var o = other as FileSystemXPathNavigator;
        return o != null && o._item.IsSamePosition(_item);
    }

    public override string LocalName
    {
        get { return _item.Name; }
    }

    public override bool MoveTo(XPathNavigator other)
    {
        var o = other as FileSystemXPathNavigator;
        if (o != null)
        {
            _item = o._item;
            return true;
        }
        return false;
    }

    // Adopts the new state, or reports failure when the state returned null.
    private bool MoveToItem(XPathItem newItem)
    {
        if (newItem == null) return false;
        _item = newItem;
        return true;
    }

    public override bool MoveToFirstAttribute()
    {
        return MoveToItem(_item.MoveToFirstAttribute());
    }

    public override bool MoveToFirstChild()
    {
        return MoveToItem(_item.MoveToFirstChild());
    }

    public override bool MoveToNext()
    {
        return MoveToItem(_item.MoveToNext());
    }

    public override bool MoveToNextAttribute()
    {
        return MoveToItem(_item.MoveToNextAttribute());
    }

    public override bool MoveToParent()
    {
        return MoveToItem(_item.MoveToParent());
    }

    public override bool MoveToPrevious()
    {
        return MoveToItem(_item.MoveToPrevious());
    }

    public override XPathNodeType NodeType
    {
        get { return _item.NodeType; }
    }

    public override string Value
    {
        get { return _item.Value; }
    }

    public override bool MoveToAttribute(string localName, string namespaceURI)
    {
        // Attributes have no namespace in this model; callers may pass null or "" for that.
        if (!String.IsNullOrEmpty(namespaceURI)) return false;
        return MoveToItem(_item.MoveToAttribute(localName));
    }
}
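The class above depends on item classes that are not listed in full here. To try the same set of overrides in isolation, here is a minimal, self-contained sketch of a read-only navigator over a hypothetical in-memory tree. Node and NodeNavigator are invented for this illustration and are not part of the article's source; the sketch has elements only (no attributes), treats the parentless node as the XPath root, and returns a shared System.Xml.NameTable instead of null as a cautious choice.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

// Hypothetical tree node; the node without a parent plays the XPath root.
public sealed class Node
{
    public readonly string Name;
    public readonly List<Node> Children = new List<Node>();
    public Node Parent;

    public Node(string name, params Node[] children)
    {
        Name = name;
        foreach (Node child in children) { child.Parent = this; Children.Add(child); }
    }
}

public sealed class NodeNavigator : XPathNavigator
{
    private static readonly XmlNameTable Names = new NameTable();
    private Node _node; // current cursor position

    public NodeNavigator(Node node) { _node = node; }

    // Members that carry no information in this model.
    public override XmlNameTable NameTable { get { return Names; } }
    public override string NamespaceURI { get { return string.Empty; } }
    public override string Prefix { get { return string.Empty; } }
    public override string BaseURI { get { return string.Empty; } }
    public override string Value { get { return string.Empty; } } // abstract on the base item class
    public override bool MoveToFirstAttribute() { return false; }
    public override bool MoveToNextAttribute() { return false; }
    public override bool MoveToFirstNamespace(XPathNamespaceScope namespaceScope) { return false; }
    public override bool MoveToNextNamespace(XPathNamespaceScope namespaceScope) { return false; }
    public override bool MoveToId(string id) { return false; }

    // Members that describe the current node.
    public override XPathNodeType NodeType
    {
        get { return _node.Parent == null ? XPathNodeType.Root : XPathNodeType.Element; }
    }
    public override string LocalName
    {
        get { return _node.Parent == null ? string.Empty : _node.Name; }
    }
    public override string Name { get { return LocalName; } }
    public override bool IsEmptyElement { get { return _node.Children.Count == 0; } }

    // Members that copy, compare, or move the cursor.
    public override XPathNavigator Clone() { return new NodeNavigator(_node); }
    public override bool IsSamePosition(XPathNavigator other)
    {
        var o = other as NodeNavigator;
        return o != null && object.ReferenceEquals(o._node, _node);
    }
    public override bool MoveTo(XPathNavigator other)
    {
        var o = other as NodeNavigator;
        return o != null && MoveToNode(o._node);
    }
    public override bool MoveToParent() { return MoveToNode(_node.Parent); }
    public override bool MoveToFirstChild()
    {
        return MoveToNode(_node.Children.Count > 0 ? _node.Children[0] : null);
    }
    public override bool MoveToNext() { return MoveToSibling(+1); }
    public override bool MoveToPrevious() { return MoveToSibling(-1); }

    private bool MoveToSibling(int offset)
    {
        if (_node.Parent == null) return false;
        List<Node> siblings = _node.Parent.Children;
        int index = siblings.IndexOf(_node) + offset;
        return MoveToNode(index >= 0 && index < siblings.Count ? siblings[index] : null);
    }

    // Moves the cursor, or reports failure when there is no target.
    private bool MoveToNode(Node target)
    {
        if (target == null) return false;
        _node = target;
        return true;
    }
}
```

With a tree such as new Node("", new Node("a", new Node("b"), new Node("b")), new Node("c", new Node("b"))), a NodeNavigator created on the top node can be passed to Evaluate or SelectSingleNode with expressions like count(//b) or /c/b. Everything beyond the overridden members, including XPath evaluation itself, comes from the XPathNavigator base class.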
Implementation of Working Objects
The following class hierarchy carries out the navigation through the file system and represents it as XML.
I won't go deep into the implementation details of each class; I'll just cover the most interesting points:
- The System.IO.FileSystemInfo class is used as the source of file system item attributes:
protected internal abstract FileSystemInfo FileSystemInfo { get ; }
- As file names allow many more special characters than XML element names do, encoding is required:
public override string Name
{
    get { return XmlConvert.EncodeLocalName(FileSystemInfo.Name); }
}
- The full path of a file determines whether two items are the same:
public virtual string Path
{
    get { return FileSystemInfo.FullName; }
}
public override bool IsSamePosition(XPathItem item)
{
    var fsi = item as FileSystemItem;
    return fsi != null && Path == fsi.Path;
}
- The parent node, which in most cases corresponds to a directory, is responsible for
getting the next and previous nodes. For this purpose, each DirectoryItem instance stores
the list of files in its directory and the current index.
- Attribute items are flexible and can be positioned on any attribute of a file or a
directory. The AttributeInfo class is a flyweight for each kind of attribute (by and large
they replicate the properties of the FileSystemInfo class). Flag attributes, such as
Hidden, System, etc., are considered false by default.
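The name encoding mentioned above affects how XPath expressions must be written, so it is worth a quick look. The sketch below wraps XmlConvert.EncodeLocalName and its counterpart XmlConvert.DecodeName; NameEncodingDemo and its two method names are invented for this illustration.

```csharp
using System;
using System.Xml;

public static class NameEncodingDemo
{
    // Turns a file name into a legal XML local name.
    public static string ToElementName(string fileName)
    {
        return XmlConvert.EncodeLocalName(fileName);
    }

    // Recovers the original file name from an encoded element name.
    public static string ToFileName(string elementName)
    {
        return XmlConvert.DecodeName(elementName);
    }
}
```

For example, ToElementName("Program Files") yields Program_x0020_Files, so an XPath step that addresses that folder by name has to use the encoded form.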
Console Utility
I also created a console utility, xfiles, that uses FileSystemXPathNavigator to evaluate
XPath expressions and perform XSLT transformations on the file system. If executed with
the /? argument, it shows the following information:
Evaluates XPath expression and performs XSLT trasformation on the local file system
Copyright: Boris Dongarov, 2012
Usage:
xfiles.exe [path] -xpath:<xpath-expression>
or
xfiles.exe [path] -xsl:<xslt-file>
where
path - a path to a file or a folder that will be a root of the XPath model (the current directory by default)
-xpath - an XPath expression to be applied to the file system
-xsl - a file name of the XSLT stylesheet to be applied to the file system
If neither option is specified, the XML will be rendered to the standard output
Some examples:
xfiles.exe
Shows the content of the current directory as XML.
xfiles.exe "C:\Program Files" -xpath:sum(//*[@Extension='.exe']/@Length)
Shows the total size of all executable files in the Program Files folder.
xfiles.exe "C:\Program Files" -xpath://*[@Temporary]/@FullName > tempfiles.txt
Writes the list of temporary files in the Program Files folder to the specified file.
xfiles C:\WINDOWS\SYSTEM32 -xsl:dirstructure.xslt > system32.htm
Transforms the content of the system directory to HTML using a custom XSLT stylesheet.
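The dirstructure.xslt stylesheet is not reproduced here, so the following is only a sketch of how a navigator can be fed to XslCompiledTransform. TransformSketch, its Transform helper, and the ListExecutables stylesheet are hypothetical; the stylesheet assumes the Extension and FullName attributes used in the examples above. Any XPathNavigator can be passed in, including one over an ordinary XML document.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public static class TransformSketch
{
    // Hypothetical stylesheet: prints FullName of every node whose Extension is ".exe".
    public const string ListExecutables = @"
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='text'/>
  <xsl:template match='/'>
    <xsl:for-each select='//*[@Extension="".exe""]'>
      <xsl:value-of select='@FullName'/>
      <xsl:text>;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>";

    // Applies a stylesheet to any navigator and returns the output as a string.
    public static string Transform(XPathNavigator input, string stylesheet)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader reader = XmlReader.Create(new StringReader(stylesheet)))
        {
            xslt.Load(reader);
        }

        using (var output = new StringWriter())
        {
            // XPathNavigator implements IXPathNavigable, so it can be passed directly.
            xslt.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

In the utility, the input would be a FileSystemXPathNavigator created for the chosen path; for a quick experiment, a navigator from an XPathDocument with the same attribute names works as a stand-in.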
Conclusion
I hope this article has shown you the power of XPathNavigator and will help you
to implement your own. You may also find the xfiles utility, and the approach to working
with files and directories that it provides, useful. All the source code is attached
to this article.