Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / productivity / Office

Document and Code Generation by LINQ and XSL

3.87/5 (8 votes)
3 Aug 2008CPOL15 min read 1   340  
An article on how to generate source code as well as populate Excel Spreadsheets.

Introduction

Sometimes, you have to work in an environment that does not allow you to use third party tools and applications. Work smarter, not harder, is the old adage. By using existing technologies in the .NET framework, you can work smarter. I will take you through the steps required to implement a practical solution, and complete the article by providing a way to produce mass billing invoice statements in Excel.

pic_0.jpg

Background

This article assumes that you have some rudimentary knowledge of XML, XSL, and LINQ, as well as C# and VB.NET. These technologies are the basis of this article. We will take a look and see how they can be used to reach our goal: create dynamic files.

The Code

There are four projects located within the zip file:

  • Contact List – Demo of the XML Literal type.
  • LINQ Code Generation – This will show how to generate C# Business Objects using a LINQ XML Literal and XslCompiledTransform.
  • Call External DLL – This will show how to call an external DLL from within a XSL C# script
  • Billing Code – This shows how to use a LINQ XML Literal with XslCompiledTransform to create a Microsoft Office 2007 Excel document to produce an invoice.

General Information

The XslCompiledTransform class was introduced in .NET v2.0, and is meant to replace the XslTransform class available in .NET v1.1. This class will use an XSL stylesheet to transform XML data to produce a new output format. This class is the crutch of the article, and is greatly enhanced with the help of LINQ XML Literal types.

The LINQ XML Literal type is new to .NET v3.5, and only exists in VB.NET at the current time. It allows you to write XML in a more natural state. The following two statements are equivalent. The first could be written in C# .NET, while the second one can’t. The second is the new LINQ XML Literal type.

VB
Dim xml = New XElement("Person", _
   New XElement("Address", _
       New XAttribute("Street", "123 Main Street"), _
       New XAttribute("City", "Tampa"), _
       New XAttribute("State", "FL")))

Dim xml2 = <Person>
        <Address Street="123 Main Street" City="Tampa" State="FL"/>
       </Person>

With the new XML Literal type, writing code to auto-generate code and documents becomes much easier, cleaner, and will produce a clearer understanding for maintenance.

Demo 1 – Contact Listing

This demo takes a list of names and produces an XML output. The syntax between <%= and %> inclusively is a new syntax, the LINQ query language. For the remainder of this article, I’m going to stick to this simplest form. The XML query language fills volumes of books, and it is not necessary to understand the more complex aspects of it to understand this article. Once you understand the basics of this article and the LINQ query, some really advanced queries can be performed.

The following piece of code will iterate over a list of names and produce the XML structure. The syntax is similar to the SQL query language with the exception that the Select statement comes last and what you are querying doesn’t have to be a database table. The data source just needs to be accessible by use of an iterator.

VB
Module Module1
   Sub Main()
      Dim names As String() = {"Dave", "Kelly", "Tom"}
      Dim xml = <Contacts>
                   <Names>
                      <%= From r In names _
                         Select <Name><%= r %></Name> %>
                   </Names>
                </Contacts>

      Console.Out.WriteLine(xml)

      Console.WriteLine(String.Format("{0}{0}{1}", _
         Environment.NewLine, "Press Enter To Continue..."))
      
      Console.ReadLine()
    End Sub
End Module

<Contacts>
  <Names>
    <Name>Dave</Name>
    <Name>Kelly</Name>
    <Name>Tom</Name>
  </Names>
</Contacts>

Press Enter To Continue...

Demo 2 – LINQ Code Generation

Demo 1 shows us how easy it is to use LINQ to create dynamic XML easily. This demo will introduce the XslCompiledTransform class which will allow us to use C# code within an XSL stylesheet to produce dynamic output as well.

While code generation is an ongoing field of study with many implementations and different ideas on how to accomplish the end goal, I too will introduce another methodology. While this article is not directly aimed at how to generate code, it is the best way to introduce some concepts to make it easier.

The part to remember when creating a code generator is to do it slowly. Start with the end product in mind, and slowly work backwards. This is the approach that was taken with Demo 2. You will see how I broke it down with Tests 1, 2, 3, and then the final project.

To start, I will create an XElement object using the final output as a starting point.

VB
Dim xslt = <class_1>
           --- Cut and paste the whole class here. ---
       </class_1>

This is the data that will be transformed by the XSL stylesheet by using the XslCompiledTransform class.

The XslCompiledTransform class at a minimum takes an XML stylesheet. The stylesheet can be saved on disk or created in memory. If the stylesheet is in memory, it can be loaded into the XslCompiledTransform by means of a class that implements IXPathNavigable, such as XElement. XElement is another new product of the LINQ language. Although you don’t realize it, you have used it already in Demo 1. The variable xml is created as its true data type, a XElement. I will be working with the new XElement data type for this demo. Both the data and the XSL will be loaded into an XElement data type.

Once again, taking the simple approach, I will create an XElement data structure for the XSL. The XSL itself will be the simplest form possible. Cutting down on complexity will lead to easier understanding and better reusability.

The XSL looks like the following:

VB
Dim xslt = <xsl:stylesheet version="1.0"
        xmlns:xsl=http://www.w3.org/1999/XSL/Transform
        xmlns:msxsl="urn:schemas-microsoft-com:xslt"
        xmlns:user="urn:my-scripts">
        
        <xsl:template match="class_1">
            <xsl:value-of select="."/>
        </xsl:template>
       </xsl:stylesheet>

The following code will be used to transform the data according to the XSL stylesheet and output it to the console. Enabling scripting at this point is not necessary, but will be later.

C#
// Enable scripting
XsltSettings settings = new XsltSettings(true, true);

// Create the XSL Transformation class
XslCompiledTransform xslt = new XslCompiledTransform();
xslt.Load(xsltScript.CreateReader(), settings, null);

// Load the data
XPathDocument doc = new XPathDocument(dataScript.CreateReader());

// Output to console using XmlTextWriter
XmlTextWriter writer = new XmlTextWriter(Console.Out);
writer.Formatting = Formatting.Indented;

// Transform the data with the XSLT stylesheet
xslt.Transform(doc.CreateNavigator(), writer);
writer.Close();

By running the code, you should see the original class output to the console. While this is not too exciting, it provides a base on which to build upon. Anything we add after this point should be confined to small changes, and retested in order to verify we didn’t introduce errors.

As the next step, Test 2, add more XML tags around things that you want to output as text and things you want to generate as data. An example of this is the <data_2> tag, and add the new tags into the XSL stylesheet.

C#
<data_2>
private Guid _id;
private int _street;
private string _address;
private string _city;
private string _state;
</data_2>


<xsl:template match="class_1">
   <xsl:value-of select="data_1"/>
   <xsl:value-of select="data_2"/>
   <xsl:value-of select="data_3"/>
   <xsl:value-of select="data_4"/>
   <xsl:value-of select="data_5"/>
   <xsl:value-of select="data_6"/>
   <xsl:value-of select="data_7"/>
</xsl:template>

I will use a common naming strategy through the rest of the project. This will continue to from Test 2 to 3 and the final project. At this point, running the project will give us the same final output.

Test 3 begins to assemble LINQ with XSL. We replace the class fields in the <data_2 /> XML tag with the following LINQ code:

XML
<data_2>
   <%= From r In table.Rows _
     Select <data_2a>
             private <%= r(1) %><%= " " %><%= r(0) %>;
        </data_2a> %>
</data_2>

The LINQ code uses a DataTable to iterate over. For each row in the table, LINQ will output an XML tag <data_2a />, the text "private" followed by the value in column 1, a space, and then the value in column 0, followed by the closing tag.

In the XSL stylesheet, we replace the XSL tag <value-of select="data_2" /> with the following:

XML
<xsl:for-each select="data_2/data_2a">
   <xsl:value-of select="."/>
</xsl:for-each>

When we run the project this time, we end up with SQL Server data types and fields that don’t conform to the standard naming convention for private fields. We will address this issue in the final project.

We begin the final project by writing some normal C# code. This code is going to be used to help us output code in a structured manner as well as convert SQL Server data types to C# data types.

C#
public static string CreateProperty(string fieldType, string fieldName)
{
    StringBuilder sb = new StringBuilder();

    sb.Append("\n");
    sb.Append("        /// <summary>\n");
    sb.Append("        /// Get/Set for " + PropertyName(fieldName) + "\n");
    sb.Append("        /// </summary>\n");
    sb.Append("        public " + FieldType(fieldType) + " " + 
                       PropertyName(fieldName) + "\n");
    sb.Append("        {\n");
    sb.Append("            get { return (" + FieldName(fieldName) + "); }\n");
    sb.Append("            set { " + FieldName(fieldName) + " = value; }\n");
    sb.Append("        }\n");

    return (sb.ToString());
}

public static string FieldType(string fieldtype)
{
    switch (fieldtype.ToLower())
    {
        case "smallint":
            return "short";
        case "uniqueidentifier":
            return "Guid";
        case "varchar":
            return "string";
        default:
            return fieldtype;
    }
}

With the completion of the conversion and formatting code, we copy the code into the XSL stylesheet in a new section <msxsl:script>.

VB
Dim xslt = <xsl:stylesheet version="1.0"
        xmlns:xsl=http://www.w3.org/1999/XSL/Transform
        xmlns:msxsl="urn:schemas-microsoft-com:xslt"
        xmlns:user="urn:my-scripts">
            <msxsl:script language="C#" implements-prefix="user">
               <![CDATA[
               
                Insert C# Code here
                
               ]]>
            
            <xsl:template match="class_1">
               <xsl:value-of select="data_1"/>

               <xsl:for-each select="data_2/data_2a">
                   <xsl:value-of select="text()"/>
                   <xsl:value-of select="user:FieldType(fieldtype)"/>
                   <xsl:value-of select="user:InsertText(' ')"/>
                   <xsl:value-of select="user:FieldName(fieldname)"/>
                   <xsl:value-of select="user:InsertText(';')"/>
               </xsl:for-each>
                .
                .
                .
            </xsl:template>
       </xsl:stylesheet>

The other change is to the XSL tag for data_2. The Select statements have some unusual syntax that we will look at in greater detail. A Select statement breaks down in the following manner:

  • user: - This is an XML prefix. This is defined as an attribute of the <script> tag. The value “user” could have been defined as “foo”, “bar”, or anything else you want. The only restriction is that the value must match in <msxsl:script> and the XSL usage.
  • FieldType - This is the name of a C# method that was previously defined and placed in the text portion of the XSL tag, <script>.
  • (fieldtype) - This the name of the XML data field that will be passed as a parameter to the C# method.

The XSL <data_2 /> tag is replaced once again with the following:

XML
<data_2>
   <%= From r In table.Rows _
       Select <data_2a>
        private <fieldtype><%= r(1) %></fieldtype><%= " " %>
                <fieldname><%= r(0) %></fieldname>;</data_2a> %>
</data_2>

In this section of code, you will note the XML tag <fieldtype> that was discussed in the previous paragraph and is used as the input parameter to the C# method. To be a little more specific, it is the value of the tag that is passed to the method. In this case, that means r(1) which equates to the value of the column 1 of the current row of the DataTable.

In the event you use a C# method to output text which contains characters which are substituted as in the case of an ampersand '&’, greater than '>’, etc., you will need to add an attribute to the <xsl:value> tag as in the case below. This will prevent the character from being replaced with its HTML equivalent.

XML
<xsl:for-each select="data_6/data_6a">
   <xsl:value-of select="user:CreateProperty(fieldtype, fieldname)" 
                 disable-output-escaping="yes"/>
</xsl:for-each>

After making the final changes, you are ready to run it for the final time. If all is gone well, you should see the output in the format that you work hoping for.

Demo 3 – Calling an External DLL

This next demo came about while I was finalizing the code for this article. A co-worker came to me with a problem. The problem was that code needed to be generated at runtime, compiled, and used. The other portion of the requirement was that the code may need to change again. This meant the introduction of an AppDomain. Those of you who have used an AppDomain before are probably cringing right about now.

After a few questions, I realized that it would fit within the parameters of this article with one additional need. The need to call an external DLL that already exists. Everything that we have learned through Demo 1 and 2 will be applied here.

This code below demonstrates the parts required to implement the solution:

C#
<msxsl:script language="C#" implements-prefix="user">
        
    <msxsl:assembly href="..\Some.Other.Library\Bin\Debug\Some.Other.Library.dll" />
    <msxsl:using namespace="Some.Other.Library.Test.Code"/>
        
    <![CDATA[
    public string Area(int length)
    {
        int square = length * length;

        Helper helper = new Helper();
        
        return (
            helper.Print("The Length Of A Square Is:", length) + "\n" + 
            helper.Print("The Area Of A Square Is:  ", square)
            );
    }
    ]]>

</msxsl:script>

The <msxsl:assembly> statement has two attributes: name and href. Only one can be used at a time. I have chosen to use href as it allows me to include a path to the DLL. The attribute name allows just the name, and assumes the GAC or the current directory. The thing to remember is that the path to the DLL is in reference to the location of the XSL file.

The <msxsl:using> statement defines the namespace(s) that you want to use, and is defined in the DLL. In the code example above, the class Helper is defined in another DLL, and has a method called Print.

As for one final requirement from my colleague, it had to be done in .NET v1.1. This proved to be a bit of a problem. The XslCompiledTransform class didn’t exist in .NET v1.1. The XslCompiledTransform class replaced XslTransform. Much to my dismay and a large amount of time, I was unable to provide a solution that would work in .NET v1.1 using the XslTransform class.

For the time being, this demo will have to be laid to rest with this one exception of having to be .NET v2.0 and up.

Demo 4 – Create XML Invoice Statement

The last demo leads us to a moderately complex implementation that is the most complex to implement. While this sound likes a contradiction, the explanation will become clear in the end if you follow along and perform the steps that I am taking to solve this problem.

This is not for the faint at heart. This will take determination and a lot of patience. To that end, I will explain the high level steps I took to implement this example, and leave it to the reader to examine the code more thoroughly. I will also limit my responses to questions, as there is no way to provide answers to personal projects.

1. Build the Excel Template Documents

As in the beginning of this article, we begin with the end in mind.

Create the Excel document using Office 2007. The Excel document should contain everything that you would want if you were going to use it for a real document. This includes formatting, text, formulas, etc. Save the document as an .XLSX file type.

The template that I created is in the “Sample Docs” directory entitled Invoice.xlsx.

The next step is to make a copy of the Excel document, open the document, and fill in the fields as if you were producing a real file. The fields should be filled in, in the order that makes most sense.

To make things a little easier, I will use the following names for the template files:

  1. Invoice.xlsx – This is the template used to create real documents.
  2. DataFile.xlsx – This template is used to determine what will have to change in order to make Invoice.xlsx parameterized.

2. Crack Open the Excel Documents

Microsoft Office 2007 is using a new file format that is standardized. I will explain some of what I have learned, but you can read more about it here. The very brief overview is that it is a standard .ZIP file that contains directories and .XML files.

Here is some information on the file formatting: MSDN, but to summarize it:

c = Cell 
r = Cell Address 
s = Style Information 
t = Text   "s" = string 
v = Value 
f = Function

Go ahead and change the extension of the template files that you created to .ZIP, or add it to the end. Using your favorite ZIP extraction tool, extract the contents of both templates.

You should end up with the following directories and .XML file:

07/12/2008  09:38 PM    <DIR>          .
07/12/2008  09:38 PM    <DIR>          ..
07/12/2008  09:38 PM    <DIR>          docProps
07/12/2008  09:38 PM    <DIR>          xl
01/01/1980  12:00 AM             1,582 [Content_Types].xml
07/12/2008  09:38 PM    <DIR>          _rels

3. Find the Differences

Using your favorite tool to diff files, compare all the files and folders between your two templates.

The two files in my demo that I needed to work with are xl\sharedStrings.xml and xl\worksheets\sheet1.xml. The other file I will need to use is xl\styles.xml.

4. Start Making Changes

  1. Start by copying the DataFile\xl\styles.xml file over Invoice\xl\styles.xml. I found through a lot of trial and error that I need to copy this file and not bother with it any further.
  2. Create the xl\sharedStrings.xml file. Start by copying the contents of Invoice\xl\sharedStrings.xml into a new VB.NET module that takes a DataSet. For my example, I need to add in the client information of name, address, etc., and then the hours that I worked. I do this through the use of an XML Literal like we have done before, using the data from the DataSet.
  3. The strings in this file are indexed. The new strings should be laid out in an order that makes sense, and can be referenced in order later. If you create a sample DataSet and fill in the values you used to create DataFile.xlsx, then you can use your diff tool to see how close you are to producing a mirror image of the file. As a final step, there are some attributes that you will have to change in the XML root node. The demo has the fields you will need to fill in.

    Save the file to Invoice\xl\styles.xml.

  4. Create the xl\worksheets\sheet1.xml file. Once again, start by copying the contents of the original file into a new VB.NET module. The difference this time is we will be using DataFile\xl\worksheets\sheet1.xml to copy the data from. The reason for this should become apparent if you diff the Invoice file against the DataFile file. There are plenty of formatting parameters that are needed. It is easier to start with this file and use the values as parameters to custom methods.
  5. These specialty methods need to be embedded within the XML Literal. This is at the point that I really like XML Literals. After finding and replacing the values you need to produce a fully functional Excel document, it still looks clean.

    Save the file to Invoice\xl\worksheets\sheet1.xml.

5. Zip up the Files and Folders

In order to test and make sure that all has gone well, you will need to zip up the files and folders. You can then rename the ZIP file to have the original extension of XLSX. At this point, you should be able to open the file in Excel and see your data filled in.

pic_1.jpg

There is one point of caution when zipping up the file; you can not include the top folder. If you attempt to zip it up as shown in the following picture:

pic_2.jpg

You will get an error:

pic_3.jpg

If you click “Yes”, you get:

pic_4.jpg

To resolve this issue, you will have to navigate into the Invoice directory, press CTRL-A or select all files, and then zip up the files.

pic_5.jpg

6. Rinse and Repeat

The chance of you getting past this step successfully, the first dozen times for your own project, is probably unlikely. You will need to repeat Steps 3 and 4 ad nauseam until all issues are resolved. There is no magic formula at this point other than diff, fix, test, then repeat.

7. Final Thoughts and Issues

The best way to begin to understand how everything fits together is to set breakpoints along the way and step through with debugger. Understanding how the first three demos work will provide the basics that can be reused in this demo. Once you understand the basics, you can focus on understanding the intricacies of the Open Office XML specification.

The only issue that I found when working with the Excel XML specification was the unusual date format. When working with dates, you will need to convert it to an integer. The easiest way to do this is to use the following method:

VB
Public Function ConvertDate(ByVal strDate As String) As Int32
    Dim newDate As DateTime = DateTime.Parse(strDate)
    Return newDate.ToOADate()
End Function

Conclusion

Hopefully, you learned something of value in this article, and can apply it to generate code, documents, or something new altogether.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)