Introduction
This beginner's tutorial shows four different ways to represent the same data in XML and how to select that data using XPath. The data represented is the page size of a census recording, which depends on the country and the year. Each census has two page sizes, a large size and a small size, and the two may be the same.
<?xml version="1.0" encoding="utf-8"?>
<STUFF>
  <TYPE1>
    <CENSUS COUNTRY="USA" YEAR="1930">
      <PAGE SIZE="SMALL">17x11</PAGE>
      <PAGE SIZE="LARGE">27x19</PAGE>
    </CENSUS>
    <CENSUS COUNTRY="USA" YEAR="1880">
      <PAGE SIZE="SMALL">17x11</PAGE>
      <PAGE SIZE="LARGE">19x25</PAGE>
    </CENSUS>
    <CENSUS COUNTRY="UK" YEAR="1871">
      <PAGE SIZE="SMALL">9.5x15</PAGE>
      <PAGE SIZE="LARGE">9.5x15</PAGE>
    </CENSUS>
    <CENSUS COUNTRY="UK" YEAR="1891">
      <PAGE SIZE="SMALL">11x16</PAGE>
      <PAGE SIZE="LARGE">11x16</PAGE>
    </CENSUS>
  </TYPE1>
  <TYPE2>
    <CENSUS>
      <COUNTRY>USA</COUNTRY>
      <YEAR>1930</YEAR>
      <PAGE>
        <SIZE>
          <SMALL>17x11</SMALL>
          <LARGE>27x19</LARGE>
        </SIZE>
      </PAGE>
    </CENSUS>
    <CENSUS>
      <COUNTRY>USA</COUNTRY>
      <YEAR>1880</YEAR>
      <PAGE>
        <SIZE>
          <SMALL>17x11</SMALL>
          <LARGE>19x25</LARGE>
        </SIZE>
      </PAGE>
    </CENSUS>
    <CENSUS>
      <COUNTRY>UK</COUNTRY>
      <YEAR>1871</YEAR>
      <PAGE>
        <SIZE>
          <SMALL>9.5x15</SMALL>
          <LARGE>9.5x15</LARGE>
        </SIZE>
      </PAGE>
    </CENSUS>
    <CENSUS>
      <COUNTRY>UK</COUNTRY>
      <YEAR>1891</YEAR>
      <PAGE>
        <SIZE>
          <SMALL>11x16</SMALL>
          <LARGE>11x16</LARGE>
        </SIZE>
      </PAGE>
    </CENSUS>
  </TYPE2>
  <TYPE3>
    <CENSUS>
      <USA YEAR="1930">
        <PAGE SIZE="SMALL">17x11</PAGE>
        <PAGE SIZE="LARGE">27x19</PAGE>
      </USA>
      <USA YEAR="1880">
        <PAGE SIZE="SMALL">17x11</PAGE>
        <PAGE SIZE="LARGE">19x25</PAGE>
      </USA>
      <UK YEAR="1871">
        <PAGE SIZE="SMALL">9.5x15</PAGE>
        <PAGE SIZE="LARGE">9.5x15</PAGE>
      </UK>
      <UK YEAR="1891">
        <PAGE SIZE="SMALL">11x16</PAGE>
        <PAGE SIZE="LARGE">11x16</PAGE>
      </UK>
    </CENSUS>
  </TYPE3>
  <TYPE4>
    <CENSUS>
      <COUNTRY>
        USA
        <YEAR>
          1930
          <PAGE>
            <SIZE TYPE="SMALL">17x11</SIZE>
            <SIZE TYPE="LARGE">27x19</SIZE>
          </PAGE>
        </YEAR>
        <YEAR>
          1880
          <PAGE>
            <SIZE TYPE="SMALL">17x11</SIZE>
            <SIZE TYPE="LARGE">19x25</SIZE>
          </PAGE>
        </YEAR>
      </COUNTRY>
      <COUNTRY>
        UK
        <YEAR>
          1871
          <PAGE>
            <SIZE TYPE="SMALL">9.5x15</SIZE>
            <SIZE TYPE="LARGE">9.5x15</SIZE>
          </PAGE>
        </YEAR>
        <YEAR>
          1891
          <PAGE>
            <SIZE TYPE="SMALL">11x16</SIZE>
            <SIZE TYPE="LARGE">11x16</SIZE>
          </PAGE>
        </YEAR>
      </COUNTRY>
    </CENSUS>
  </TYPE4>
</STUFF>
Background
Deciding when to use an element or an attribute to represent XML data is confusing for us beginners. Even more confusing is how to select the data when it is represented in different forms.
Using the Code
Create a new C# console application called ConsoleXMLTest and replace the contents of Class1.cs with the following code. Create a file called data.xml and place the above XML into that file. Place it in a directory where your application can locate it at run time. I set a build event under the project properties to copy data.xml from the project directory to the output directory automatically, as follows:
copy "$(ProjectDir)data.xml" "$(TargetDir)"
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Collections;

namespace ConsoleXMLTest
{
    class Class1
    {
        [STAThread]
        static void Main(string[] args)
        {
            string fileName = "data.xml";
            FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read);

            // Each test builds its own XPathDocument, so the stream is rewound
            // and wrapped in a fresh reader before every test after the first.
            XmlTextReader reader = new XmlTextReader(fs);
            TestOne(reader);

            fs.Seek(0, SeekOrigin.Begin);
            reader = new XmlTextReader(fs);
            TestTwo(reader);

            fs.Seek(0, SeekOrigin.Begin);
            reader = new XmlTextReader(fs);
            TestThree(reader);

            fs.Seek(0, SeekOrigin.Begin);
            reader = new XmlTextReader(fs);
            TestFour(reader);
        }

        static void TestOne(XmlTextReader reader)
        {
            System.Console.WriteLine("TestOne");
            XPathDocument xdoc = new XPathDocument(reader);
            XPathNavigator nav = xdoc.CreateNavigator();
            XPathNodeIterator nodeItor = nav.Select(
                "STUFF/TYPE1/CENSUS[@COUNTRY='USA' and @YEAR='1930']/PAGE");
            nodeItor.MoveNext();
            TraverseSiblings(nodeItor);
            System.Console.WriteLine();
        }

        static void TestTwo(XmlTextReader reader)
        {
            System.Console.WriteLine("TestTwo");
            XPathDocument xdoc = new XPathDocument(reader);
            XPathNavigator nav = xdoc.CreateNavigator();
            XPathNodeIterator nodeItor = nav.Select(
                "STUFF/TYPE2/CENSUS[COUNTRY='USA' and YEAR='1930']/PAGE/SIZE");
            nodeItor.MoveNext();
            TraverseChildren(nodeItor);
            System.Console.WriteLine();
        }

        static void TestThree(XmlTextReader reader)
        {
            System.Console.WriteLine("TestThree");
            XPathDocument xdoc = new XPathDocument(reader);
            XPathNavigator nav = xdoc.CreateNavigator();
            XPathNodeIterator nodeItor = nav.Select(
                "STUFF/TYPE3/CENSUS/USA[@YEAR='1930']/PAGE");
            nodeItor.MoveNext();
            TraverseSiblings(nodeItor);
            System.Console.WriteLine();
        }

        static void TestFour(XmlTextReader reader)
        {
            System.Console.WriteLine("TestFour");
            XPathDocument xdoc = new XPathDocument(reader);
            XPathNavigator nav = xdoc.CreateNavigator();
            XPathNodeIterator nodeItor = nav.Select(
                "STUFF/TYPE4/CENSUS/COUNTRY[normalize-space(text())='USA']" +
                "/YEAR[normalize-space(text())='1930']/PAGE/SIZE");
            nodeItor.MoveNext();
            TraverseSiblings(nodeItor);
            System.Console.WriteLine();
        }

        // Prints the current node and each sibling that follows it.
        static void TraverseSiblings(XPathNodeIterator nodeItor)
        {
            XPathNodeIterator igor = nodeItor.Clone();
            bool more = false;
            do
            {
                // Print first, then advance, so a node with no following
                // sibling is printed exactly once.
                PrintNode(igor.Current);
                more = igor.Current.MoveToNext();
            } while (more);
        }

        // Prints each child of the current node.
        static void TraverseChildren(XPathNodeIterator nodeItor)
        {
            XPathNodeIterator igor = nodeItor.Clone();
            igor.Current.MoveToFirstChild();
            bool more = false;
            do
            {
                PrintNode(igor.Current);
                more = igor.Current.MoveToNext();
            } while (more);
        }

        // Non-recursive depth-first walk that prints every node without children.
        static void Traverse(XPathNodeIterator nodeItor)
        {
            Stack nodeStack = new Stack();
            nodeStack.Push(nodeItor.Clone());
            while (nodeStack.Count > 0)
            {
                XPathNodeIterator igor = (XPathNodeIterator)nodeStack.Pop();
                if (igor.Current.HasChildren == false)
                {
                    PrintNode(igor.Current);
                }
                else
                {
                    XPathNodeIterator egor = igor.Clone();
                    egor.Current.MoveToFirstChild();
                    // Collect the children in document order, then move them
                    // to nodeStack in reverse so the first child is popped first.
                    Stack reverseStack = new Stack();
                    reverseStack.Push(egor.Clone());
                    while (egor.Current.MoveToNext() == true)
                    {
                        reverseStack.Push(egor.Clone());
                    }
                    while (reverseStack.Count > 0)
                    {
                        nodeStack.Push(reverseStack.Pop());
                    }
                }
            }
        }

        static void PrintNode(XPathNavigator nav)
        {
            System.Console.WriteLine(nav.Name + ":" + nav.Value +
                " Type : " + nav.NodeType.ToString());
        }
    }
}
Points of Interest
Learning how to select nodes using XPath is not very difficult. Since I like to learn by example, I wrote this code to reinforce the things I learned from studying MSDN and various web sites.
To select a node that has a particular attribute:
XPathNodeIterator nodeItor = nav.Select(
"STUFF/TYPE1/CENSUS[@COUNTRY='USA' and @YEAR='1930']/PAGE");
The above query selects all PAGE nodes whose CENSUS parent has COUNTRY and YEAR attributes equal to USA and 1930.
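As a small illustration of attribute predicates, the sketch below inlines a trimmed copy of the TYPE1 data and loops over the matches with MoveNext rather than walking siblings. The class name AttributePredicateDemo and the helper SelectValues are made up for this example and are not part of the article's code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class AttributePredicateDemo
{
    // Trimmed copy of the TYPE1 layout, inlined so the sketch runs on its own.
    const string Xml =
        "<STUFF><TYPE1>" +
        "<CENSUS COUNTRY='USA' YEAR='1930'>" +
        "<PAGE SIZE='SMALL'>17x11</PAGE><PAGE SIZE='LARGE'>27x19</PAGE>" +
        "</CENSUS>" +
        "<CENSUS COUNTRY='UK' YEAR='1871'>" +
        "<PAGE SIZE='SMALL'>9.5x15</PAGE><PAGE SIZE='LARGE'>9.5x15</PAGE>" +
        "</CENSUS>" +
        "</TYPE1></STUFF>";

    // Hypothetical helper: returns the string value of every node the query matches.
    public static List<string> SelectValues(string xpath)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(Xml)).CreateNavigator();
        List<string> values = new List<string>();
        XPathNodeIterator it = nav.Select(xpath);
        while (it.MoveNext())   // advances through the matched nodes one by one
        {
            values.Add(it.Current.Value);
        }
        return values;
    }

    public static void Main()
    {
        string census = "STUFF/TYPE1/CENSUS[@COUNTRY='USA' and @YEAR='1930']";

        // A further attribute predicate narrows the result to a single PAGE.
        Console.WriteLine(string.Join(",", SelectValues(census + "/PAGE[@SIZE='LARGE']")));  // 27x19

        // Attributes can also be selected as nodes in their own right.
        Console.WriteLine(string.Join(",", SelectValues(census + "/PAGE/@SIZE")));  // SMALL,LARGE
    }
}
```

Looping with MoveNext visits exactly the nodes the query matched, so it does not depend on the matches being adjacent siblings.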
To select a node that has a particular value:
XPathNodeIterator nodeItor = nav.Select(
"STUFF/TYPE2/CENSUS[COUNTRY='USA' and YEAR='1930']/PAGE/SIZE");
The above query selects the SIZE nodes under the PAGE nodes of every CENSUS element whose COUNTRY and YEAR children have the values USA and 1930, respectively.
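The same kind of predicate can feed a single-result lookup. The sketch below is one possible way to do it with XPathNavigator.SelectSingleNode on a trimmed copy of the TYPE2 data; the class ElementValueDemo and the helper FindSize are invented for illustration, and the helper assumes its arguments contain no quote characters.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class ElementValueDemo
{
    // Trimmed copy of the TYPE2 layout, inlined so the sketch runs on its own.
    const string Xml =
        "<STUFF><TYPE2>" +
        "<CENSUS><COUNTRY>USA</COUNTRY><YEAR>1930</YEAR>" +
        "<PAGE><SIZE><SMALL>17x11</SMALL><LARGE>27x19</LARGE></SIZE></PAGE></CENSUS>" +
        "<CENSUS><COUNTRY>USA</COUNTRY><YEAR>1880</YEAR>" +
        "<PAGE><SIZE><SMALL>17x11</SMALL><LARGE>19x25</LARGE></SIZE></PAGE></CENSUS>" +
        "</TYPE2></STUFF>";

    // Hypothetical helper: returns the page size for a country and year,
    // or null when no CENSUS element matches.
    // Assumption: the arguments contain no quotes, so plain concatenation is safe here.
    public static string FindSize(string country, string year, string size)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(Xml)).CreateNavigator();
        string query = "STUFF/TYPE2/CENSUS[COUNTRY='" + country +
                       "' and YEAR='" + year + "']/PAGE/SIZE/" + size;
        XPathNavigator match = nav.SelectSingleNode(query);
        return match == null ? null : match.Value;
    }

    public static void Main()
    {
        Console.WriteLine(FindSize("USA", "1880", "LARGE"));          // 19x25
        Console.WriteLine(FindSize("UK", "1871", "SMALL") == null);   // True: no UK data in this trimmed sample
    }
}
```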
TestFour is of particular interest because its XML elements have both a text value and child elements (mixed content). While studying XML, I didn't come across any examples of this, and at first I didn't think it could be done. Here is the XML data for TestFour:
<TYPE4>
  <CENSUS>
    <COUNTRY>
      USA
      <YEAR>
        1930
        <PAGE>
          <SIZE TYPE="SMALL">17x11</SIZE>
          <SIZE TYPE="LARGE">27x19</SIZE>
        </PAGE>
      </YEAR>
      <YEAR>
        1880
        <PAGE>
          <SIZE TYPE="SMALL">17x11</SIZE>
          <SIZE TYPE="LARGE">19x25</SIZE>
        </PAGE>
      </YEAR>
    </COUNTRY>
    <COUNTRY>
      UK
      <YEAR>
        1871
        <PAGE>
          <SIZE TYPE="SMALL">9.5x15</SIZE>
          <SIZE TYPE="LARGE">9.5x15</SIZE>
        </PAGE>
      </YEAR>
      <YEAR>
        1891
        <PAGE>
          <SIZE TYPE="SMALL">11x16</SIZE>
          <SIZE TYPE="LARGE">11x16</SIZE>
        </PAGE>
      </YEAR>
    </COUNTRY>
  </CENSUS>
</TYPE4>
When I selected the YEAR node and displayed its value, I got all of the whitespace around the value as well. I learned to use this query:
XPathNodeIterator nodeItor = nav.Select(
"STUFF/TYPE4/CENSUS/COUNTRY[normalize-space(text())='USA']"+
"/YEAR[normalize-space(text())='1930']/PAGE/SIZE");
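The query selects the COUNTRY node whose text equals USA once the whitespace is stripped away, and it does the same for YEAR. The normalize-space function removes leading and trailing whitespace and collapses any internal run of whitespace to a single space.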
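To see what normalize-space(text()) buys here, the sketch below evaluates the raw and normalized text of one COUNTRY element and compares a predicate on text() with a predicate on the whole element. The class name NormalizeSpaceDemo, the Load helper, and the inlined sample with its line breaks are assumptions made for this example.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class NormalizeSpaceDemo
{
    // One COUNTRY from the TYPE4 layout, keeping the line breaks around the text.
    const string Xml =
        "<STUFF><TYPE4><CENSUS>\n" +
        "<COUNTRY>\n  USA\n" +
        "  <YEAR>\n    1930\n" +
        "    <PAGE><SIZE TYPE='SMALL'>17x11</SIZE><SIZE TYPE='LARGE'>27x19</SIZE></PAGE>\n" +
        "  </YEAR>\n" +
        "</COUNTRY>\n" +
        "</CENSUS></TYPE4></STUFF>";

    // Hypothetical helper: loads the sample and returns a navigator at the root.
    public static XPathNavigator Load()
    {
        return new XPathDocument(new StringReader(Xml)).CreateNavigator();
    }

    public static void Main()
    {
        XPathNavigator nav = Load();
        string country = "STUFF/TYPE4/CENSUS/COUNTRY";

        // The raw text node keeps its surrounding whitespace; normalize-space removes it.
        string raw = (string)nav.Evaluate("string(" + country + "/text())");
        string clean = (string)nav.Evaluate("normalize-space(" + country + "/text())");
        Console.WriteLine("[" + raw + "]");
        Console.WriteLine("[" + clean + "]");   // [USA]

        // Testing the whole element (.) finds nothing, because its string value
        // also contains the text of the YEAR and SIZE descendants.
        Console.WriteLine(nav.Select(country + "[normalize-space(.)='USA']").Count);       // 0
        Console.WriteLine(nav.Select(country + "[normalize-space(text())='USA']").Count);  // 1
    }
}
```

When text() matches several text nodes, XPath 1.0 converts the node set to a string using only the first node in document order, which is why the country name has to come before the YEAR children for this query to work.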
Additionally, I have written several recursive routines to traverse an XML tree during my studies. In this code, I decided to use a non-recursive solution, the Traverse method, which uses a Stack and a while loop. None of the four tests call Traverse; it is included for reference.
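For comparison, a recursive walk of the kind mentioned above might look like the sketch below. It is not the author's routine; the class RecursiveTraverseDemo and the methods CollectLeaves and Run are hypothetical names, and the sketch records node type and value for every node without children, much as the stack-based version prints them.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class RecursiveTraverseDemo
{
    // A single PAGE element, inlined so the sketch runs on its own.
    const string Xml =
        "<PAGE><SIZE TYPE='SMALL'>17x11</SIZE><SIZE TYPE='LARGE'>27x19</SIZE></PAGE>";

    // Depth-first walk: records every node that has no children.
    public static void CollectLeaves(XPathNavigator node, List<string> leaves)
    {
        if (!node.HasChildren)
        {
            leaves.Add(node.NodeType + ":" + node.Value);
            return;
        }

        // Clone before moving so the caller's navigator keeps its position.
        XPathNavigator child = node.Clone();
        if (child.MoveToFirstChild())
        {
            do
            {
                CollectLeaves(child, leaves);
            } while (child.MoveToNext());
        }
    }

    // Hypothetical helper: walks the sample from the document root.
    public static List<string> Run()
    {
        XPathNavigator root = new XPathDocument(new StringReader(Xml)).CreateNavigator();
        List<string> leaves = new List<string>();
        CollectLeaves(root, leaves);
        return leaves;
    }

    public static void Main()
    {
        foreach (string leaf in Run())
        {
            Console.WriteLine(leaf);   // Text:17x11, then Text:27x19
        }
    }
}
```

The recursive form is shorter, while the stack-based form avoids deep call chains on heavily nested documents.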
Notice that I have a habit of naming iterator variables igor. It came from seeing so many named itor and I couldn't help but think of Igor from Young Frankenstein. So you will see some Igors and some Egors in the code.
The code produces the following output:
TestOne
PAGE:17x11 Type : Element
PAGE:27x19 Type : Element
TestTwo
SMALL:17x11 Type : Element
LARGE:27x19 Type : Element
TestThree
PAGE:17x11 Type : Element
PAGE:27x19 Type : Element
TestFour
SIZE:17x11 Type : Element
SIZE:27x19 Type : Element
License
This article has no explicit license attached to it, but may contain usage terms in the article text or the download files themselves. If in doubt, please contact the author via the discussion board below.
Master Degree in C.S. .NET, Unix, Macintosh (OS X, 9, 8...), PC server side, and MFC. 17 years experience. Graphics, Distributed processing, Object Oriented Methods and Models.
Java, C#, C++. Webservices. XML. Real name is Geoffrey Slinker.