XML Alphabetizer

8 Mar 2011
This project lets the user alphabetize any XML file without transforming it with XSLT or modifying it in any way other than reordering nodes, based either on the names of the nodes themselves or on an attribute of any node.

Introduction

This project lets the user alphabetize any XML file without transforming that file using XSLT.

I searched quite a bit for a way to sort an XML document alphabetically, and most of the results relied on overly complicated XSLT transformations. Some of the templates I tried solved part of the puzzle, but after spending a lot of time on them, I decided to start over using LINQ and XDocument.

I ended up with a solution that sorts any XML document by node name or by the value of a chosen attribute, optionally starting at a particular depth within the document. The utility can also sort the attributes of each XML node alphabetically, ascending or descending, from left to right.

Examples

Original XML - Unsorted

XML
<Collection>
   <CollectionB>
      <B2 Text="1" Stat="5"></B2>
      <A2 Text="2" Type="3" Stat="7"></A2>
      <A1></A1>
      <B1 typeName="2">
		  <B22></B22>
		  <A22></A22>
		  <A12></A12>
	  </B1>
      <B1 typeName="1"></B1>
   </CollectionB>
   <CollectionA>
      <B2 Echo="1" Bravo="1" Tango="1" Alpha="1"></B2>
      <A2></A2>
      <A1></A1>
      <B1></B1>
   </CollectionA>
</Collection>

Default Sort

XML
<Collection>
  <CollectionA>
    <A1 />
    <A2 />
    <B1 />
    <B2 Echo="1" Bravo="1" Tango="1" Alpha="1" />
  </CollectionA>
  <CollectionB>
    <A1 />
    <A2 Text="2" Type="3" Stat="7" />
    <B1 typeName="2">
      <A12 />
      <A22 />
      <B22 />
    </B1>
    <B1 typeName="1" />
    <B2 Text="1" Stat="5" />
  </CollectionB>
</Collection>

Sort by "Text" Attribute

XML
<Collection>
  <CollectionA>
    <A1 />
    <A2 />
    <B1 />
    <B2 Echo="1" Bravo="1" Tango="1" Alpha="1" />
  </CollectionA>
  <CollectionB>
    <B2 Text="1" Stat="5" />
    <A2 Text="2" Type="3" Stat="7" />
    <A1 />
    <B1 typeName="2">
      <A12 />
      <A22 />
      <B22 />
    </B1>
    <B1 typeName="1" />
  </CollectionB>
</Collection>

Sort by "Text" Attribute below level 2 within XML Tree

XML
<Collection>
  <CollectionB>
    <B2 Text="1" Stat="5" />
    <A2 Text="2" Type="3" Stat="7" />
    <A1 />
    <B1 typeName="2">
      <A12 />
      <A22 />
      <B22 />
    </B1>
    <B1 typeName="1" />
  </CollectionB>
  <CollectionA>
    <B2 Echo="1" Bravo="1" Tango="1" Alpha="1" />
    <A2 />
    <A1 />
    <B1 />
  </CollectionA>
</Collection>

Sort by "Text" Attribute below level 2 - Sort Attributes Ascending

XML
<Collection>
  <CollectionB>
    <B2 Stat="5" Text="1" />
    <A2 Stat="7" Text="2" Type="3" />
    <A1 />
    <B1 typeName="2">
      <A12 />
      <A22 />
      <B22 />
    </B1>
    <B1 typeName="1" />
  </CollectionB>
  <CollectionA>
    <B2 Alpha="1" Bravo="1" Echo="1" Tango="1" />
    <A2 />
    <A1 />
    <B1 />
  </CollectionA>
</Collection>

Background

Lambda Expressions

I rely heavily on lambda expressions in this code to cut down on the number of lines and to make good use of LINQ. A lambda expression is more concise than an explicitly declared delegate. Here is a quick example of a lambda expression to give you some background on the syntax before we dive into the code.

For this example, let's take a strongly typed collection of Customer objects.

C#
List<Customer> myCustomers;

A Customer has a public property FirstName.
Once this collection is filled with Customer objects, we can then sort that collection based on each Customer.FirstName.

C#
myCustomers = myCustomers.OrderBy(x => x.FirstName).ToList(); // Sort ascending
myCustomers = myCustomers.OrderByDescending(x => x.FirstName).ToList(); // Sort descending

In the syntax above, the "x" represents a Customer object. You can use any variable name you choose.

Hint: When you are typing a Lambda expression and you get to the part of the syntax where you type the "." after "x", you should get code completion and be able to choose from any visible property of that object.
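
To make the lambda example concrete, here is a minimal, self-contained sketch. The Customer class and the sample names are assumptions made up for illustration; only OrderBy, Select and ToList come from LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Customer type, defined here only so the sketch compiles.
public class Customer
{
    public string FirstName { get; set; }
}

public static class LambdaDemo
{
    // Returns the first names in ascending order.
    public static List<string> SortedNames(IEnumerable<Customer> customers)
    {
        return customers
            .OrderBy(x => x.FirstName)   // "x" is each Customer in turn
            .Select(x => x.FirstName)
            .ToList();
    }

    public static void Main()
    {
        List<Customer> myCustomers = new List<Customer>
        {
            new Customer { FirstName = "Carol" },
            new Customer { FirstName = "Alice" },
            new Customer { FirstName = "Bob" }
        };

        // Prints Alice, Bob, Carol, one name per line.
        foreach (string name in SortedNames(myCustomers))
        {
            Console.WriteLine(name);
        }
    }
}
```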

Conditional Operator

The sort in this utility relies on the conditional operator inside the orderby clause; it was the only way I found to sort inline and apply logic at the same time. The code uses a two-level conditional expression: the outer level checks that the current child node sits below the chosen level, and the inner level checks that the node has the attribute named in the UI. If both conditions are met, the node is sorted by the value of that attribute. If only the first is met, it is sorted by its element name. Otherwise the sort key is empty and the original order is kept.

This operator is useful when you want to assign one value if a condition is true and a different value if it is not. It uses ? and : in the form condition ? valueIfTrue : valueIfFalse.

Normal "if" syntax:

C#
if (myValue == "something")
{
	otherValue = myValue;
}
else
{
	otherValue = "defaultValue";
}

Simplified syntax using the Conditional Operator:

C#
otherValue = (myValue == "something") ? myValue : "defaultValue";
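
The same operator can appear inside a LINQ orderby clause, which is how the sort in this article picks its key. The sketch below is a simplified stand-in rather than the utility's own code: the Item class and its Tag property are hypothetical and exist only to show the pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: Tag is optional, Name is always set.
public class Item
{
    public string Name { get; set; }
    public string Tag { get; set; }
}

public static class ConditionalOrderDemo
{
    // Sort by Tag when it is present, otherwise fall back to Name.
    public static List<string> SortedNames(IEnumerable<Item> items)
    {
        return (from item in items
                orderby (!string.IsNullOrEmpty(item.Tag) ? item.Tag : item.Name)
                select item.Name).ToList();
    }

    public static void Main()
    {
        List<Item> items = new List<Item>
        {
            new Item { Name = "B2", Tag = "1" },
            new Item { Name = "A1" },
            new Item { Name = "A2", Tag = "2" }
        };

        // The sort keys are "1", "A1" and "2" respectively.
        Console.WriteLine(string.Join(", ", SortedNames(items)));
    }
}
```

The keys here mirror the CollectionB children in the "Sort by Text Attribute" example: elements with the attribute are ordered by its value, and the rest fall back to their names.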

Using the Utility

In this example (Figure 1), I am loading the XML file "Test.xml", starting my sort at node level 2, sorting by the attribute "Text" and sorting all XML attributes alphabetically in ascending order left to right.

Each run saves the result to a file called "result.xml" in the working directory, overwriting any previous result, and also shows it in the TextArea at the bottom of the form window.

Figure 1: XML Alphabetizer

Note: If a full path is not specified for the Source XML file, the utility looks in the working directory and appends ".xml" to the file name if it is not present.
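
The file name handling described in the note could look roughly like this. The helper name ResolveSourcePath and its exact rules are assumptions based on the note, not code taken from the utility; in particular, this sketch appends ".xml" whether or not a full path was given.

```csharp
using System;
using System.IO;

public static class SourcePathDemo
{
    // Hypothetical helper; the name and rules are assumptions
    // derived from the note, not the utility's actual code.
    public static string ResolveSourcePath(string source)
    {
        // Append ".xml" when the name does not already end with it.
        if (!source.EndsWith(".xml", StringComparison.OrdinalIgnoreCase))
        {
            source += ".xml";
        }

        // A relative name is resolved against the working directory.
        if (!Path.IsPathRooted(source))
        {
            source = Path.Combine(Directory.GetCurrentDirectory(), source);
        }
        return source;
    }

    public static void Main()
    {
        // Prints the working directory followed by Test.xml.
        Console.WriteLine(ResolveSourcePath("Test"));
    }
}
```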

Using the Code

The core of this utility uses the LINQ to XML XDocument class to load the XML, rearrange it, and write it back out in the requested order.

C#
XDocument doc = XDocument.Load(sourceDoc);

//level, attribute and sortAttributes come from the options chosen in the UI,
//so the same sort can be applied to any XML file.
XDocument sortedDoc = Sort(doc, level, attribute, sortAttributes);

XmlWriterSettings settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
settings.Indent = true;

using (XmlWriter writer = XmlWriter.Create(resultDoc, settings))
{
	sortedDoc.WriteTo(writer);
}
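
To try the writer settings without touching any files, the same pattern can target a StringBuilder. This is a small sketch under that assumption; the class and method names are made up for illustration, and the sample XML is a fragment of the example document.

```csharp
using System;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class WriteDemo
{
    // Serializes a document with the same writer settings as above,
    // but into a string so that no files are needed.
    public static string ToIndentedXml(XDocument doc)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;  // no <?xml ...?> header
        settings.Indent = true;              // one element per line

        StringBuilder output = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            doc.WriteTo(writer);
        }
        return output.ToString();
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<Collection><B1 typeName=\"1\" /><A1 /></Collection>");
        Console.WriteLine(ToIndentedXml(doc));
    }
}
```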

To perform the sort on the XML tree, we use Linq to query the collection of XML objects, apply an "orderby" clause based on the options that were selected from the UI, then recursively call the Sort method for all the nodes found.

If we choose to sort the attributes of all XML nodes, then we will enter into the switch statement and apply a sort to the collection of XAttribute objects associated with that particular XML node.

C#
#region Private Method - Sort
/// <summary>
/// Sort an XML Element based on a minimum level to perform the
/// sort from and either based on the value
/// of an attribute of an XML Element or by the name of the XML Element.
/// </summary>
/// <param name="file">Loaded document to sort</param>
/// <param name="level">Minimum level to apply the sort from.
/// 0 for root level.</param>
/// <param name="attribute">Name of the attribute to sort by.
/// "" for no sort</param>
/// <param name="sortAttributes">Sort attributes none,
/// ascending or descending for all sorted XML nodes</param>
/// <returns>Sorted XDocument based on the criteria passed in.</returns>
private static XDocument Sort(XDocument file,
	int level, string attribute, int sortAttributes)
{
	return new XDocument(Sort(file.Root, level, attribute, sortAttributes));
}
/// <summary>
/// Sort an XML Element based on a minimum level to perform the
/// sort from and either based on the value
/// of an attribute of an XML Element or by the name of the XML Element.
/// </summary>
/// <param name="element">Element to sort</param>
/// <param name="level">Minimum level to apply the sort from.
/// 0 for root level.</param>
/// <param name="attribute">Name of the attribute to sort by.
/// "" for no sort</param>
/// <param name="sortAttributes">Sort attributes none,
/// ascending or descending for all sorted XML nodes</param>
/// <returns>Sorted XElement based on the criteria passed in.</returns>
private static XElement Sort(XElement element,
	int level, string attribute, int sortAttributes)
{
	XElement newElement = new XElement(element.Name,
		from child in element.Elements()
		orderby
			(child.Ancestors().Count() > level)
				? (
					(child.HasAttributes && 
						!string.IsNullOrEmpty(attribute)
					&& child.Attribute(attribute) != null)
						? child.Attribute(attribute).
							Value.ToString()
						: child.Name.ToString()
					)
				: ""  //End of the orderby clause
		select Sort(child, level, attribute, sortAttributes));
	if (element.HasAttributes)
	{
		switch (sortAttributes)
		{
			case 0: //None
				foreach (XAttribute attrib in element.Attributes())
				{
					newElement.SetAttributeValue
						(attrib.Name, attrib.Value);
				}
				break;
			case 1: //Ascending
				foreach (XAttribute attrib in element.Attributes().
				OrderBy(a => a.Name.ToString()))
				{
					newElement.SetAttributeValue
						(attrib.Name, attrib.Value);
				}
				break;
			case 2: //Descending
				foreach (XAttribute attrib in element.Attributes().
				OrderByDescending(a => a.Name.ToString()))
				{
					newElement.SetAttributeValue
						(attrib.Name, attrib.Value);
				}
				break;
			default:
				break;
		}
	}
	return newElement;
}
#endregion
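
One thing to be aware of: Sort builds each new element from its child elements and attributes only, so text inside an element (for example `<Name>Bob</Name>`) is not copied to the result. The sketch below shows one possible way to carry text over for leaf elements. It is not part of the utility: the name SortKeepingText is hypothetical, it sorts by element name only to stay short, and it still drops text that is mixed in alongside child elements.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class TextPreservingSortDemo
{
    // Hypothetical variant: sorts children by name at every level and
    // keeps attributes plus the text of elements with no child elements.
    public static XElement SortKeepingText(XElement element)
    {
        XElement newElement = new XElement(element.Name,
            element.Attributes(),   // parented attributes are copied
            from child in element.Elements()
            orderby child.Name.ToString()
            select SortKeepingText(child));

        // Leaf elements: copy the text value, if there is any.
        if (!element.HasElements && element.Value.Length > 0)
        {
            newElement.Value = element.Value;
        }
        return newElement;
    }

    public static void Main()
    {
        XElement root = XElement.Parse(
            "<People><Zed id=\"2\">last</Zed><Amy id=\"1\">first</Amy></People>");

        // Amy is written before Zed, each with its id and text intact.
        Console.WriteLine(SortKeepingText(root));
    }
}
```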

Points of Interest

This was the first time I needed to use a LINQ query. After finding a good example online, I was able to use one to iterate through the entire collection, apply an orderby to the objects, and then perform an action on each item returned.

One thing that I learned is that if you need to add some logic within an orderby clause of a LINQ query, you can use the conditional operator within that clause:

C#
XElement newElement = new XElement(element.Name,
	from child in element.Elements()
	orderby
		(child.Ancestors().Count() > level)
			? (
				(child.HasAttributes && 
				!string.IsNullOrEmpty(attribute) && 
				child.Attribute(attribute) != null)
					? child.Attribute(attribute).Value.ToString()
					: child.Name.ToString()
				)
			: ""  //End of the orderby clause
	select Sort(child, level, attribute, sortAttributes));
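
Because this is LINQ to Objects, the key expression can also call an ordinary method, which some readers may find easier to follow than the nested conditional. The sketch below moves the same decisions into a helper. The name SortKey is hypothetical, and the sample XML is a cut-down version of the example at the top of the article.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SortKeyDemo
{
    // Hypothetical helper holding the same decisions as the nested
    // conditional operator in the orderby clause above.
    public static string SortKey(XElement child, int level, string attribute)
    {
        // At or above the chosen level: constant key, so the stable
        // sort leaves these elements in their original order.
        if (child.Ancestors().Count() <= level)
        {
            return "";
        }

        // Below the level: prefer the attribute value when it exists.
        if (!string.IsNullOrEmpty(attribute) && child.Attribute(attribute) != null)
        {
            return child.Attribute(attribute).Value;
        }
        return child.Name.ToString();
    }

    public static void Main()
    {
        XElement root = XElement.Parse(
            "<Collection><CollectionB>" +
            "<B2 Text=\"1\" /><A2 Text=\"2\" /><A1 />" +
            "</CollectionB></Collection>");

        // Show the key each child of CollectionB would be sorted by.
        foreach (XElement child in root.Element("CollectionB").Elements())
        {
            Console.WriteLine(child.Name + " -> \"" + SortKey(child, 0, "Text") + "\"");
        }
    }
}
```

With a helper like this, the query would read `orderby SortKey(child, level, attribute)` in place of the inline conditional.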

History

  • Version 1.0 - Original submission

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)