Introduction
NanoXML is a small, fast XML DOM parser for the .NET platform (targeting .NET 2.0). Its main feature, and the requirement that motivated it, is that it does not use the System.Xml namespace. My tests also show it running far faster than the built-in .NET parsers.
Background
The idea was born when one of my friends needed something to parse XML in C# without using the System.Xml namespace. Writing and testing took about three hours. The parser doesn't support the entire XML specification, but it is capable of parsing most of the XML I've tried.
Using the code
Using NanoXML is simple. You just need to add NanoXMLParser.cs to your project and use the TObject.Shared namespace. The main top-level class is NanoXMLDocument, which parses XML and builds the DOM.
NanoXML cannot load data from files, so your application must read the XML into a string itself. Something like this:
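byte[] data;
using (FileStream fs = new FileStream(args[0], FileMode.Open, FileAccess.Read))
{
    data = new byte[fs.Length];
    int offset = 0, read;
    // Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
    while (offset < data.Length && (read = fs.Read(data, offset, data.Length - offset)) > 0)
        offset += read;
}
string strData = Encoding.UTF8.GetString(data); // assumes the file is UTF-8 encoded
NanoXMLDocument xml = new NanoXMLDocument(strData);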
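If you don't need to handle the stream yourself, the same loading step can probably be done in one call. This is only a sketch: XmlTextLoader and Load are names I made up for illustration, and it assumes the file is UTF-8 (or carries a byte order mark). File.ReadAllText is a standard .NET 2.0 method; the NanoXMLDocument call is left as a comment so the snippet compiles without NanoXMLParser.cs.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of NanoXML.
public static class XmlTextLoader
{
    // Reads the whole file as text. A byte order mark, if present, selects the
    // encoding and is stripped; otherwise UTF-8 is used (an assumption).
    public static string Load(string path)
    {
        return File.ReadAllText(path, Encoding.UTF8);
    }

    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("usage: XmlTextLoader <file.xml>");
            return;
        }

        string strData = Load(args[0]);
        Console.WriteLine("Loaded {0} characters", strData.Length);

        // With NanoXMLParser.cs in the project, the string is passed on as before:
        // NanoXMLDocument xml = new NanoXMLDocument(strData);
    }
}
```

One difference from the byte-array approach: Encoding.UTF8.GetString keeps a leading byte order mark in the string as a U+FEFF character, while File.ReadAllText drops it.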
Note that NanoXML ignores the encoding attribute of the XML declaration, so the bytes must be decoded into a string by the caller. Now that the document is loaded, we can get the data of any element or any of its attributes:
string myAttribute = xml.RootNode["Subnode"].GetAttribute("myAttribute");
NanoXML also ignores comments and DOCTYPE declarations. XML declarations (<?xml ?>) are parsed and stored in the NanoXMLDocument object.
Performance
The most striking thing about this parser is its performance. Before submitting the code here, I benchmarked the parser against the built-in .NET parsers, and the results surprised me. All tests were run on a 1.1 MB SVG file held in a string (i.e., without disk access overhead). The test results are shown in the screenshot below:
As we can see, NanoXML processes a 1.1 MB file almost immediately (17 ms). XmlDocument loaded the document in 11 seconds. XmlReader (a forward-only streaming parser, which by design should be much faster than DOM) took about 7 seconds to read the whole document. For XmlReader, the test does nothing but read the content from beginning to end. The benchmark test sources are available for download.
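For readers who want to try a comparison like this themselves, here is a minimal timing sketch. It is not the benchmark source from the download: ParserBenchmark, Work, BuildSampleXml and the other method names are my own illustrative names, and the input is synthetic XML rather than the SVG file used above. Only the built-in parsers are timed so that the snippet compiles on its own; a NanoXML run is indicated in a comment.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical delegate type for the piece of work being timed.
public delegate void Work();

public static class ParserBenchmark
{
    // Builds synthetic in-memory XML (illustrative data, not the SVG file from the article).
    public static string BuildSampleXml(int itemCount)
    {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < itemCount; i++)
            sb.Append("<item id=\"").Append(i).Append("\">value</item>");
        return sb.Append("</root>").ToString();
    }

    // DOM: builds the full tree, then returns the number of <item> elements.
    public static int LoadWithXmlDocument(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.XmlResolver = null; // do not resolve external resources such as DTDs
        doc.LoadXml(xml);
        return doc.GetElementsByTagName("item").Count;
    }

    // Streaming: reads from beginning to end and returns the number of nodes visited.
    public static int ReadWithXmlReader(string xml)
    {
        int nodes = 0;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read()) nodes++;
        }
        return nodes;
    }

    // Runs the work once and returns the elapsed wall-clock time in milliseconds.
    public static long TimeMs(Work work)
    {
        Stopwatch sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        string xml = BuildSampleXml(20000);
        Console.WriteLine("XmlDocument: {0} ms", TimeMs(delegate { LoadWithXmlDocument(xml); }));
        Console.WriteLine("XmlReader:   {0} ms", TimeMs(delegate { ReadWithXmlReader(xml); }));
        // With NanoXMLParser.cs in the project, NanoXML would be timed the same way:
        // TimeMs(delegate { new NanoXMLDocument(xml); });
    }
}
```

When reproducing numbers on a real SVG file, it may be worth checking whether the file has a DOCTYPE: unless the resolver is disabled, the built-in parsers may try to resolve the external DTD, and that can add to the measured time.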
Points of Interest
This parser may be useful either for its performance or in situations where the built-in parsers (the System.Xml namespace) cannot be used.