Transform From One XML Structure to Another

5 Jul 2014, CPOL, 3 min read
Two approaches for how to transform XML data to another format

(Not a complete project, but a program file and the XML files)

Introduction

XML structures are very common and sometimes it is necessary to transform from one XML structure into another.

This can be done in many ways and here I am going to show two approaches.

Background

Recently a member asked a question about how to convert from one XML format to another. I tried to help with a solution but his question was closed before I could post another one that would suit his needs better.

I thought it was a waste of my time just to throw away my solutions, so I decided to post them here instead. I will use his XML files as sample data here.

The complete question can be found here.

Using the Code

XML Structures

Below are the extracts of the two XML structures.

Input XML

This is the structure of the XML that is going to be transformed.

XML
<?xml version="1.0" encoding="utf-16"?>
<item_authorizes generated_time="2014/05/07 18:01:49">
  <item_authorize>
    <id>1</id>
    <name>test_1</name>
    <enchant_attr_list>
      <data>
        <level>1</level>
        <attr1>PvPAttackRatio_Physical_O 2</attr1>
        <attr2>damage_physical 10</attr2>
        <attr3>hit_accuracy 5</attr3>
        <attr4>block 10</attr4>
      </data>
      <data>
        <level>2</level>
        <attr1>PvPAttackRatio_Physical_O 4</attr1>
        <attr2>damage_magical 10</attr2>
        <attr3>physical_defend 10</attr3>
        <attr4>hit_accuracy 20</attr4>
      </data>
    </enchant_attr_list>
  </item_authorize>
</item_authorizes>

Output XML

And this is the resulting XML structure:

XML
<?xml version="1.0" encoding="utf-8"?>
<tempering_tables xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:noNamespaceSchemaLocation="tempering_table.xsd">
  <tempering_table id="1">
    <modifiers level="1">
      <add name="PVP_ATTACK_RATIO_PHYSICAL" value="2" bonus="true"/>
      <add name="PHYSICAL_DAMAGE" value="10" bonus="true"/>
      <add name="MAGICAL_ACCURACY" value="5" bonus="true"/>
      <add name="BLOCK" value="10" bonus="true"/>
    </modifiers>
    <modifiers level="2">
      <add name="PVP_ATTACK_RATIO_PHYSICAL" value="4" bonus="true"/>
      <add name="MAGICAL_DAMAGE" value="10" bonus="true"/>
      <add name="PHYSICAL_DEFENSE" value="10" bonus="true"/>
      <add name="MAGICAL_ACCURACY" value="20" bonus="true"/>
    </modifiers>
  </tempering_table>
</tempering_tables>

Solution 1

My first solution uses XSLT (XSL Transformations; XSL stands for Extensible Stylesheet Language).
XSLT is designed for transforming XML into something else, e.g. XML to XML, XML to HTML or XML to text.

Learning XSLT has a bit of a threshold, but in my opinion, it is a very useful tool to have when dealing with XML.
I will not try to teach anyone how to master XSLT here. There are several sites dedicated to this.
For example, W3Schools: http://www.w3schools.com/xsl/default.asp.

The XSLT Code

The basic idea of XSLT is to create templates that match nodes in the input structure.
Inside a template, the matched node can be processed or, if the node is just a container, control can be passed on to the templates for its children with xsl:apply-templates.
xsl:apply-templates processes every selected child node in turn, so there is no need for a for-each loop in this case.
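
To see template matching in isolation, here is a minimal sketch that runs a two-template stylesheet against an in-memory document. It is not part of the article's download: the class `TemplateMatchDemo`, the method `Transform` and the small stylesheet are made up for illustration. The `data` template is applied once per `data` child without any `xsl:for-each`.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class TemplateMatchDemo
{
  // Illustrative stylesheet: one template for the container, one for each data node
  const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes' />
  <xsl:template match='enchant_attr_list'>
    <tempering_table>
      <xsl:apply-templates select='data' />
    </tempering_table>
  </xsl:template>
  <xsl:template match='data'>
    <modifiers level='{level}' />
  </xsl:template>
</xsl:stylesheet>";

  // Hypothetical helper: transforms an XML string and returns the result as a string
  public static string Transform(string xml)
  {
    XslCompiledTransform xslt = new XslCompiledTransform();
    using (XmlReader xsltReader = XmlReader.Create(new StringReader(Stylesheet)))
    {
      xslt.Load(xsltReader);
    }

    StringWriter output = new StringWriter();
    using (XmlReader input = XmlReader.Create(new StringReader(xml)))
    using (XmlWriter writer = XmlWriter.Create(output, xslt.OutputSettings))
    {
      xslt.Transform(input, null, writer);
    }
    return output.ToString();
  }

  public static void Main()
  {
    string xml = "<enchant_attr_list><data><level>1</level></data>"
               + "<data><level>2</level></data></enchant_attr_list>";
    // The data template fires once for each data element
    Console.WriteLine(Transform(xml));
  }
}
```

The attribute value template `level='{level}'` is a shorter way of writing the `xsl:attribute` / `xsl:value-of` pair.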

Filename = Xml2XML.xslt

XML
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
  <xsl:output method="xml" indent="yes" version="1.0" />

  <!-- Code used to convert to upper case. XSLT 1.0 lacks a predefined function for this. -->
  <xsl:variable name="smallcase" select="'abcdefghijklmnopqrstuvwxyz'" />
  <xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'" />
  <xsl:template name="to-upper">
    <xsl:param name="input" />
    <xsl:value-of select="translate($input, $smallcase, $uppercase)" />
  </xsl:template>
  
  <xsl:template match="item_authorizes">
    <tempering_tables xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xsi:noNamespaceSchemaLocation="tempering_table.xsd">
      <xsl:apply-templates select="item_authorize" />
    </tempering_tables>
  </xsl:template>

  <xsl:template match="item_authorize">
    <tempering_table>
      <xsl:attribute name="id">
        <xsl:value-of select="id"/>
      </xsl:attribute>
      <xsl:apply-templates select="enchant_attr_list" />
    </tempering_table>
  </xsl:template>

  <xsl:template match="enchant_attr_list">
    <xsl:apply-templates select="data" mode="new" />
  </xsl:template>

  <xsl:template match="data" mode="new">
    <modifiers>
      <xsl:attribute name="level">
        <xsl:value-of select="level"/>
      </xsl:attribute>
      <xsl:apply-templates select="node()" />
    </modifiers>
  </xsl:template>

  <xsl:template match="node()">
    <xsl:if test="substring(name(.), 1, 4) = 'attr'">
      <add>
        <xsl:attribute name="name">
          <xsl:call-template name="to-upper">
            <xsl:with-param name="input">
              <xsl:value-of select="substring-before(.,' ')"/>
            </xsl:with-param>
          </xsl:call-template>
        </xsl:attribute>
        <xsl:attribute name="value">
          <xsl:value-of select="substring-after(.,' ')"/>
        </xsl:attribute>
        <xsl:attribute name="bonus">
          <xsl:text>true</xsl:text>
        </xsl:attribute>
      </add>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>

Visual Studio has built-in support for XSLT. While you are editing an XML or XSLT file, an XML menu appears in the menu bar.
This menu offers two options for XSLT: Start XSLT Debugging and Start XSLT Without Debugging.

This is very useful when trying XSLT out.

The C# Code

This method is generic and can be used to convert any input XML file to whatever output format the XSLT stylesheet defines.

C#
// Requires: using System.IO; using System.Xml; using System.Xml.Xsl;
public void ConvertXmlWithXSLT(string xsltFile, string inputFile, string outputFile)
{
  // Load the XSLT stylesheet into the transform object
  XslCompiledTransform xslt = new XslCompiledTransform(false);
  using (XmlReader readerXslt = XmlReader.Create(xsltFile))
  {
    xslt.Load(readerXslt);
  }

  XmlWriterSettings xmlSettings = new XmlWriterSettings();
  xmlSettings.Indent = true;

  // Create and open the output file.
  // Disposing the writer flushes everything to the file before it is closed.
  using (FileStream fsOutput = File.Create(outputFile))
  using (XmlWriter writerXML = XmlWriter.Create(fsOutput, xmlSettings))
  // Open the input file
  using (XmlReader readerInput = XmlReader.Create(inputFile))
  {
    xslt.Transform(readerInput, null, writerXML);
  }
}

Unsolved Issues

In this particular case, the XSLT solution above falls short because some of the names must be translated to a different naming convention rather than simply converted to upper case.

  • PvPAttackRatio_Physical_O should be translated to PVP_ATTACK_RATIO_PHYSICAL
  • damage_physical should be translated to PHYSICAL_DAMAGE
  • and so on for several other names

It is entirely possible to solve this within XSLT, but I felt it might not be the best approach in this case.
Hence, I came up with another solution.
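
For completeness, here is one way the name translation could be handled inside XSLT 1.0, using `xsl:choose` for the special cases and `translate()` as the fallback. This is only a sketch under my own assumptions, not the stylesheet from the download: the class `XsltNameMappingDemo`, the mode name `add` and the variable names are invented for this example, and only the two names listed above are mapped.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XsltNameMappingDemo
{
  // Illustrative stylesheet: xsl:choose handles known names, translate() upper-cases the rest
  const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes' />
  <xsl:variable name='lc' select='""abcdefghijklmnopqrstuvwxyz""' />
  <xsl:variable name='uc' select='""ABCDEFGHIJKLMNOPQRSTUVWXYZ""' />
  <xsl:template match='data'>
    <modifiers level='{level}'>
      <xsl:apply-templates select='*[starts-with(name(), ""attr"")]' mode='add' />
    </modifiers>
  </xsl:template>
  <xsl:template match='*' mode='add'>
    <xsl:variable name='key' select='substring-before(., "" "")' />
    <xsl:variable name='name'>
      <xsl:choose>
        <xsl:when test='$key = ""PvPAttackRatio_Physical_O""'>PVP_ATTACK_RATIO_PHYSICAL</xsl:when>
        <xsl:when test='$key = ""damage_physical""'>PHYSICAL_DAMAGE</xsl:when>
        <xsl:otherwise>
          <xsl:value-of select='translate($key, $lc, $uc)' />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <add name='{$name}' value='{substring-after(., "" "")}' bonus='true' />
  </xsl:template>
</xsl:stylesheet>";

  // Hypothetical helper: transforms an XML string and returns the result as a string
  public static string Transform(string xml)
  {
    XslCompiledTransform xslt = new XslCompiledTransform();
    using (XmlReader xsltReader = XmlReader.Create(new StringReader(Stylesheet)))
    {
      xslt.Load(xsltReader);
    }

    StringWriter output = new StringWriter();
    using (XmlReader input = XmlReader.Create(new StringReader(xml)))
    using (XmlWriter writer = XmlWriter.Create(output, xslt.OutputSettings))
    {
      xslt.Transform(input, null, writer);
    }
    return output.ToString();
  }

  public static void Main()
  {
    string xml = "<data><level>1</level><attr1>PvPAttackRatio_Physical_O 2</attr1>"
               + "<attr2>damage_physical 10</attr2><attr3>block 10</attr3></data>";
    Console.WriteLine(Transform(xml));
  }
}
```

Every additional name needs its own `xsl:when` branch, so the stylesheet grows with the list of exceptions. That is the main reason a lookup table in code felt more manageable to me.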

Solution 2

In this case, XDocument and XElement (LINQ to XML) will be used to do the transformation. The resulting method is dedicated to performing this specific transformation, but the principle should be reusable.

In order to solve the problem with the changed naming convention, I created a lookup table.

C#
static Dictionary<string, string> nameLookup = new Dictionary<string, string>();

Which is then filled using this method:

C#
static void CreateLookupTable()
{
  // Add the names that require translation
  nameLookup.Add("PvPAttackRatio_Physical_O", "PVP_ATTACK_RATIO_PHYSICAL");
  nameLookup.Add("damage_physical", "PHYSICAL_DAMAGE");
  nameLookup.Add("damage_magical", "MAGICAL_DAMAGE");
  nameLookup.Add("hit_accuracy", "PHYSICAL_ACCURACY");
  nameLookup.Add("attack_delay", "ATTACK_SPEED");
  nameLookup.Add("physical_defend", "PHYSICAL_DEFENSE");
  nameLookup.Add("PvPAttackRatio_Magical_O", "PVP_ATTACK_RATIO_MAGICAL");
  nameLookup.Add("magical_hit_accuracy", "MAGICAL_ACCURACY");
  nameLookup.Add("dodge", "EVASION");
  nameLookup.Add("PvPDefendRatio_Physical_O", "PVP_DEFEND_RATIO_PHYSICAL");
}
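
As a side note, the same table can be written with a collection initializer, and the "translate or fall back to upper case" rule can be isolated in a small helper. The sketch below is an alternative layout and not the code from the download: the class `NameTranslator` and the method `Translate` are hypothetical names, and only two of the entries are repeated to keep it short.

```csharp
using System;
using System.Collections.Generic;

public static class NameTranslator
{
  // Same idea as nameLookup, filled with a collection initializer (two sample entries only)
  static readonly Dictionary<string, string> lookup = new Dictionary<string, string>
  {
    { "PvPAttackRatio_Physical_O", "PVP_ATTACK_RATIO_PHYSICAL" },
    { "damage_physical", "PHYSICAL_DAMAGE" }
  };

  // Hypothetical helper: the translated name if known, otherwise the original in upper case
  public static string Translate(string name)
  {
    string translated;
    return lookup.TryGetValue(name, out translated)
      ? translated
      : name.ToUpperInvariant();
  }

  public static void Main()
  {
    Console.WriteLine(Translate("damage_physical")); // PHYSICAL_DAMAGE
    Console.WriteLine(Translate("block"));           // BLOCK
  }
}
```

`TryGetValue` performs a single dictionary lookup, where `ContainsKey` followed by the indexer performs two.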

And finally, the method that performs the transformation.

C#
public void ConvertXmlWithLinq(string inputFile, string outputFile)
{
  XDocument xdInput = XDocument.Load(inputFile);
  XElement xeOutputRoot = new XElement("tempering_tables");
  xeOutputRoot.Add(
    new XAttribute(XNamespace.Xmlns + "xsi", "http://www.w3.org/2001/XMLSchema-instance"));
  xeOutputRoot.Add(
    new XAttribute(XNamespace.Get("http://www.w3.org/2001/XMLSchema-instance") + 
    "noNamespaceSchemaLocation", "tempering_table.xsd"));

  // Loop through the input XML and create the transformed XML
  foreach (XElement xeItemAuthorize in xdInput.Root.Elements(XName.Get("item_authorize")))
  {
    XElement xeTemperingTable = new XElement("tempering_table");
    int id = int.Parse(xeItemAuthorize.Element("id").Value);
    xeTemperingTable.Add(new XAttribute("id", id));

    foreach (XElement xeEnchanterAttribute in xeItemAuthorize.Elements("enchant_attr_list"))
    {
      foreach (XElement xeData in xeEnchanterAttribute.Elements("data"))
      {
        XElement xeModifiers = new XElement("modifiers");
        foreach (XElement xeDataMembers in xeData.Elements())
        {
          if (xeDataMembers.Name.LocalName == "level")
          {
            xeModifiers.Add(new XAttribute("level", xeDataMembers.Value));
          }
          else if (xeDataMembers.Name.LocalName.StartsWith("attr"))
          {
            // Split the value into two parts.
            string[] parts = xeDataMembers.Value.Split(' ');
            if (parts.Length != 2)
              throw new Exception(
                 String.Format("The attribute value {0} is not supported.", 
                  xeDataMembers.Value));
                                
            // Use the translated name from the lookup table if there is one,
            // otherwise use the original name in upper case
            string name;
            if (!nameLookup.TryGetValue(parts[0], out name))
              name = parts[0].ToUpper();
            XElement xeAdd = new XElement("add");
            xeAdd.Add(new XAttribute("name", name));
            xeAdd.Add(new XAttribute("value", parts[1]));
            xeAdd.Add(new XAttribute("bonus", true));

            xeModifiers.Add(xeAdd);
          }
          else
          {
            throw new Exception(String.Format("Unsupported node found: '{0}'."
              , xeDataMembers.Name.LocalName));
          }
        }

        xeTemperingTable.Add(xeModifiers);
      }
    }
    xeOutputRoot.Add(xeTemperingTable);
  }

  // Now save the result in a file
  XDocument xdOutput = new XDocument();
  xdOutput.Declaration = new XDeclaration("1.0", "utf-8", "yes");
  xdOutput.AddAnnotation("Automatic XML translation.");
  xdOutput.Add(xeOutputRoot);
  xdOutput.Save(outputFile);
}
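
The nested loops can also be expressed with the functional construction style of LINQ to XML, where an element and its children are built in a single expression. The following is a sketch of that style for one `data` element only. It rests on a few assumptions: `FunctionalConstructionDemo` and `ToModifiers` are hypothetical names, the lookup table is left out (names are only upper-cased), and every attr value is assumed to contain exactly one space, so the validation from the method above is skipped.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class FunctionalConstructionDemo
{
  // Builds one <modifiers> element from one <data> element.
  // Assumes a <level> child exists and each attr value looks like "name value".
  public static XElement ToModifiers(XElement data)
  {
    return new XElement("modifiers",
      new XAttribute("level", (string)data.Element("level")),
      from attr in data.Elements()
      where attr.Name.LocalName.StartsWith("attr")
      let parts = attr.Value.Split(' ')
      select new XElement("add",
        new XAttribute("name", parts[0].ToUpperInvariant()),
        new XAttribute("value", parts[1]),
        new XAttribute("bonus", true)));
  }

  public static void Main()
  {
    XElement data = XElement.Parse(
      "<data><level>1</level><attr1>block 10</attr1></data>");
    Console.WriteLine(ToModifiers(data));
  }
}
```

Whether this reads better than explicit loops is a matter of taste. The loops make it easier to throw a descriptive exception for unexpected nodes, which is why I kept them in the method above.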

History

  1. First release
  2. Changed nameLookup.Add("hit_accuracy", "PHYSICAL"); to nameLookup.Add("hit_accuracy", "PHYSICAL_ACCURACY");

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)