Introduction
This project lets you alphabetize any XML file without having to transform that file using XSLT.
I searched quite a bit for a way to sort an XML document alphabetically, and most of the results I found were based on overly complicated XSLT transformations. Some of the templates I tried handled one piece of the puzzle, but after spending a lot of time on them, I decided to start over using LINQ and the XDocument class.
I ended up creating a solution that sorts any XML document by node name or by the value of any attribute, starting at a particular depth within the document. The utility can also sort the attributes of every XML node alphabetically, ascending or descending, from left to right.
Examples
Original XML - Unsorted
<Collection>
<CollectionB>
<B2 Text="1" Stat="5"></B2>
<A2 Text="2" Type="3" Stat="7"></A2>
<A1></A1>
<B1 typeName="2">
<B22></B22>
<A22></A22>
<A12></A12>
</B1>
<B1 typeName="1"></B1>
</CollectionB>
<CollectionA>
<B2 Echo="1" Bravo="1" Tango="1" Alpha="1"></B2>
<A2></A2>
<A1></A1>
<B1></B1>
</CollectionA>
</Collection>
Default Sort
<Collection>
<CollectionA>
<A1 />
<A2 />
<B1 />
<B2 Echo="1" Bravo="1" Tango="1" Alpha="1" />
</CollectionA>
<CollectionB>
<A1 />
<A2 Text="2" Type="3" Stat="7" />
<B1 typeName="2">
<A12 />
<A22 />
<B22 />
</B1>
<B1 typeName="1" />
<B2 Text="1" Stat="5" />
</CollectionB>
</Collection>
Sort by "Text" Attribute
<Collection>
<CollectionA>
<A1 />
<A2 />
<B1 />
<B2 Echo="1" Bravo="1" Tango="1" Alpha="1" />
</CollectionA>
<CollectionB>
<B2 Text="1" Stat="5" />
<A2 Text="2" Type="3" Stat="7" />
<A1 />
<B1 typeName="2">
<A12 />
<A22 />
<B22 />
</B1>
<B1 typeName="1" />
</CollectionB>
</Collection>
Sort by "Text" Attribute below level 2 within XML Tree
<Collection>
<CollectionB>
<B2 Text="1" Stat="5" />
<A2 Text="2" Type="3" Stat="7" />
<A1 />
<B1 typeName="2">
<A12 />
<A22 />
<B22 />
</B1>
<B1 typeName="1" />
</CollectionB>
<CollectionA>
<B2 Echo="1" Bravo="1" Tango="1" Alpha="1" />
<A2 />
<A1 />
<B1 />
</CollectionA>
</Collection>
Sort by "Text" Attribute below level 2 - Sort Attributes Ascending
<Collection>
<CollectionB>
<B2 Stat="5" Text="1" />
<A2 Stat="7" Text="2" Type="3" />
<A1 />
<B1 typeName="2">
<A12 />
<A22 />
<B22 />
</B1>
<B1 typeName="1" />
</CollectionB>
<CollectionA>
<B2 Alpha="1" Bravo="1" Echo="1" Tango="1" />
<A2 />
<A1 />
<B1 />
</CollectionA>
</Collection>
Background
Lambda Expressions
I rely heavily on lambda expressions in this code to cut down on the lines of code and to use LINQ efficiently. Lambda expressions are more concise than separately declared delegate methods. Here is a quick example of a lambda expression to give you some background on the syntax before we dive into the code.
For this example, let's take a strongly typed collection of Customer objects.
List<Customer> myCustomers;
A Customer has a public property FirstName.
Once this collection is filled with Customer objects, we can sort it by each Customer.FirstName.
myCustomers = myCustomers.OrderBy(x => x.FirstName).ToList(); // Sort ascending
myCustomers = myCustomers.OrderByDescending(x => x.FirstName).ToList(); // Sort descending
In the syntax above, the "x" represents a Customer object. You can use any variable name you choose.
Hint: When you are typing a Lambda expression and you get to the part of the syntax where you type the "." after "x", you should get code completion and be able to choose from any visible property of that object.
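To make the snippet above easier to try, here is a minimal, self-contained sketch. The Customer class is reduced to the single FirstName property mentioned in the text, and the SortByFirstName helper and the sample names are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Customer type with only the property used in the text.
public class Customer
{
    public string FirstName { get; set; } = "";
}

public static class CustomerSortDemo
{
    // Returns a new list ordered by FirstName; the input list is not modified.
    public static List<Customer> SortByFirstName(List<Customer> customers, bool descending)
    {
        if (descending)
        {
            return customers.OrderByDescending(x => x.FirstName).ToList();
        }
        return customers.OrderBy(x => x.FirstName).ToList();
    }

    public static void Main()
    {
        List<Customer> myCustomers = new List<Customer>
        {
            new Customer { FirstName = "Carol" },
            new Customer { FirstName = "Alice" },
            new Customer { FirstName = "Bob" }
        };

        // Prints Alice, Bob, Carol on separate lines.
        foreach (Customer c in SortByFirstName(myCustomers, false))
        {
            Console.WriteLine(c.FirstName);
        }
    }
}
```

Note that OrderBy does not sort in place: it returns a new ordered sequence, which is why the result is assigned back to a variable after calling ToList().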
Conditional Operator
The Sort method shown under Using the Code applies the conditional operator within its orderby clause. I could find no other supported way to perform an inline sort and apply logic at the same time. There, a two-level conditional expression verifies that the current child node sits below the chosen level and that it carries the attribute we specified in the UI; if both conditions are met, the sort uses the value of that attribute. If the node is below the chosen level but lacks the attribute, the sort falls back to the node name.
This operator is very useful when you want to assign one value to a variable if a certain condition is true and another value if it is not. The operator uses ? and : and is basically (if true) ? then : else.
Normal "if" syntax:
if (myValue == "something")
{
    otherValue = myValue;
}
else
{
    otherValue = "defaultValue";
}
Simplified syntax using the Conditional Operator:
otherValue = (myValue == "something") ? myValue : "defaultValue";
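Because the utility nests one conditional inside another, a small sketch of a two-level conditional may help. The ChooseSortKey helper and its parameters are hypothetical and only imitate the shape of the decision described above; they are not part of the utility.

```csharp
using System;

public static class SortKeyDemo
{
    // Hypothetical helper that picks the text to sort by.
    // depth:          how deep the node sits in the tree
    // level:          the depth below which sorting applies
    // attributeValue: value of the chosen attribute, or null if the node lacks it
    // nodeName:       fallback key when no attribute value is available
    public static string ChooseSortKey(int depth, int level, string attributeValue, string nodeName)
    {
        return (depth > level)
            ? ((attributeValue != null) ? attributeValue : nodeName) // inner conditional
            : "";                                                    // outer "else": no sorting
    }

    public static void Main()
    {
        Console.WriteLine(ChooseSortKey(3, 2, "1", "B2"));         // prints 1
        Console.WriteLine(ChooseSortKey(3, 2, null, "A1"));        // prints A1
        Console.WriteLine(ChooseSortKey(1, 2, "1", "B2") == "");   // prints True
    }
}
```

Returning the same empty key for every node above the chosen level is what leaves those nodes in their original order, since the LINQ ordering is stable.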
Using the Utility
In this example (Figure 1), I am loading the XML file "Test.xml", starting my sort at node level 2, sorting by the attribute "Text", and sorting all XML attributes alphabetically in ascending order from left to right.
Each run saves the results to a file called "result.xml" in the working directory, overwriting any previous file, and also shows them in the TextArea at the bottom of the form window.
Figure 1
Note: If a full path is not specified for the Source XML file, I look in the working directory and append ".xml" to the file name if not present.
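The file name rule in the note could be sketched roughly as follows. ResolveSourcePath is a hypothetical helper written for illustration; the utility's actual lookup code is not shown in this article, so treat the details (such as the case-insensitive extension check) as assumptions.

```csharp
using System;
using System.IO;

public static class SourcePathDemo
{
    // Hypothetical helper: a full path is returned unchanged; otherwise ".xml"
    // is appended when missing and the name is resolved against the working directory.
    public static string ResolveSourcePath(string source)
    {
        if (Path.IsPathRooted(source))
        {
            return source;
        }

        string name = source.EndsWith(".xml", StringComparison.OrdinalIgnoreCase)
            ? source
            : source + ".xml";

        return Path.Combine(Directory.GetCurrentDirectory(), name);
    }

    public static void Main()
    {
        // Both calls resolve to Test.xml inside the current working directory.
        Console.WriteLine(ResolveSourcePath("Test"));
        Console.WriteLine(ResolveSourcePath("Test.xml"));
    }
}
```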
Using the Code
The core of this utility uses the LINQ to XML XDocument class to load all of the XML, manipulate it, then write it out in the order requested.
// Requires: using System.Xml; using System.Xml.Linq;
XDocument doc = XDocument.Load(sourceDoc);
XDocument sortedDoc = Sort(doc, level, attribute, sortAttributes);

// Write the sorted tree, indented and without the XML declaration.
XmlWriterSettings settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
settings.Indent = true;
using (XmlWriter writer = XmlWriter.Create(resultDoc, settings))
{
    sortedDoc.WriteTo(writer);
}
To perform the sort on the XML tree, we use LINQ to query the collection of child elements, apply an "orderby" clause based on the options that were selected in the UI, then recursively call the Sort method for each node found.
Once the children are sorted, the switch statement copies the XAttribute objects of that particular XML node to the new element: in their original order (0), ascending by name (1), or descending by name (2).
#region Private Method - Sort
// Sorts a whole document by rebuilding it from its sorted root element.
private static XDocument Sort(XDocument file,
    int level, string attribute, int sortAttributes)
{
    return new XDocument(Sort(file.Root, level, attribute, sortAttributes));
}

// Rebuilds one element with its children sorted, then recurses into each child.
private static XElement Sort(XElement element,
    int level, string attribute, int sortAttributes)
{
    XElement newElement = new XElement(element.Name,
        from child in element.Elements()
        orderby
            (child.Ancestors().Count() > level)
                ? (
                    // Below the chosen level: use the attribute value if the
                    // child has that attribute, otherwise the node name.
                    (child.HasAttributes &&
                     !string.IsNullOrEmpty(attribute) &&
                     child.Attribute(attribute) != null)
                        ? child.Attribute(attribute).Value
                        : child.Name.ToString()
                  )
                // At or above the chosen level: identical keys keep the original order.
                : ""
        select Sort(child, level, attribute, sortAttributes));

    if (element.HasAttributes)
    {
        switch (sortAttributes)
        {
            case 0: // Keep the original attribute order
                foreach (XAttribute attrib in element.Attributes())
                {
                    newElement.SetAttributeValue(attrib.Name, attrib.Value);
                }
                break;
            case 1: // Ascending by attribute name
                foreach (XAttribute attrib in element.Attributes().
                    OrderBy(a => a.Name.ToString()))
                {
                    newElement.SetAttributeValue(attrib.Name, attrib.Value);
                }
                break;
            case 2: // Descending by attribute name
                foreach (XAttribute attrib in element.Attributes().
                    OrderByDescending(a => a.Name.ToString()))
                {
                    newElement.SetAttributeValue(attrib.Name, attrib.Value);
                }
                break;
            default:
                break;
        }
    }
    return newElement;
}
#endregion
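To look at the attribute step on its own, the following sketch applies the same OrderBy and SetAttributeValue calls to a single element taken from the example XML. The WithSortedAttributes helper is hypothetical and, unlike the Sort method, it copies only the element name and attributes, not the child nodes.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AttributeSortDemo
{
    // Hypothetical helper: returns a new, childless element with the same name
    // whose attributes are ordered by name.
    public static XElement WithSortedAttributes(XElement element, bool descending)
    {
        var ordered = descending
            ? element.Attributes().OrderByDescending(a => a.Name.ToString())
            : element.Attributes().OrderBy(a => a.Name.ToString());

        XElement copy = new XElement(element.Name);
        foreach (XAttribute attrib in ordered)
        {
            // New attributes are appended, so insertion order becomes document order.
            copy.SetAttributeValue(attrib.Name, attrib.Value);
        }
        return copy;
    }

    public static void Main()
    {
        XElement b2 = XElement.Parse(
            "<B2 Echo=\"1\" Bravo=\"1\" Tango=\"1\" Alpha=\"1\"></B2>");

        // Prints: <B2 Alpha="1" Bravo="1" Echo="1" Tango="1" />
        Console.WriteLine(WithSortedAttributes(b2, false));
    }
}
```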
Points of Interest
This is the first time that I have needed to use a LINQ query. After finding a good example online, I was able to use one to iterate through the entire collection, perform an orderby on the objects, then perform an action on each item returned.
One thing that I learned is that if you need to add some logic within an orderby clause of a LINQ query, you can do it inline with a conditional operator:
XElement newElement = new XElement(element.Name,
    from child in element.Elements()
    orderby
        (child.Ancestors().Count() > level)
            ? (
                (child.HasAttributes &&
                 !string.IsNullOrEmpty(attribute) &&
                 child.Attribute(attribute) != null)
                    ? child.Attribute(attribute).Value.ToString()
                    : child.Name.ToString()
              )
            : ""
    select Sort(child, level, attribute, sortAttributes));
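The key logic can be easier to follow when it is pulled out into a named method and tried on a few nodes. The sketch below does that with LINQ method syntax on in-memory elements; the SortKey helper is an assumption made for illustration and is not how the utility itself is written.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class OrderByKeyDemo
{
    // Hypothetical helper that makes the same decision as the conditional expression.
    public static string SortKey(XElement child, int level, string attribute)
    {
        if (child.Ancestors().Count() <= level)
        {
            return ""; // at or above the chosen level: keep the original order
        }

        XAttribute match = string.IsNullOrEmpty(attribute)
            ? null
            : child.Attribute(attribute);

        return (match != null) ? match.Value : child.Name.ToString();
    }

    public static void Main()
    {
        XElement root = XElement.Parse(
            "<Collection><B2 Text=\"1\" /><A2 Text=\"2\" /><A1 /></Collection>");

        // No attribute chosen: children are ordered by node name.
        Console.WriteLine(string.Join(",", root.Elements()
            .OrderBy(c => SortKey(c, 0, null))
            .Select(c => c.Name.LocalName))); // prints A1,A2,B2

        // "Text" chosen: attribute values sort first, then the node name fallback.
        Console.WriteLine(string.Join(",", root.Elements()
            .OrderBy(c => SortKey(c, 0, "Text"))
            .Select(c => c.Name.LocalName))); // prints B2,A2,A1
    }
}
```

Since the children of the root each have one ancestor, passing a level of 1 or more in this sample would give every child the empty key and leave the order unchanged.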
History
- Version 1.0 - Original submission