ExcelML Templates Editor – data visualization using Office 2003 SpreadsheetML features

Dan Radu

4.65/5 (8 votes)

27 Oct 2006CPOL5 min read

1.7K

This article explains how to use the Office 2003 SpreadsheetML feature to display data stored in .NET datasets.

Introduction

In the article WordML Templates Editor – data visualization using Office 2003 WordML features, we have tackled the data visualization issue using Word ML and Word DOM approaches. This article raises the same problem, but in an MS Excel way. It is just a simple "template" editor which allows users to design a visualization file with data gathered from their databases. The sample is using the same information from Northwind database: company, order, order details, and the list of products. The application uses a standard Excel feature to expose fields, repeating attributes and formatting specifiers: the Name object (and the Names collection). In SpreadsheetML schema, the alternatives are the ss:Names and ss:NamedRange elements.

Template creation

As in the Word template creation case, the editor offers access to data tables and data fields, it allows repeat marks insertion (for tables with many records) and formatting variables specifying. The code is very similar to the Word Templates Editor, except the Excel API usage (Workbooks, Sheets, ActiveWorkbook, ActiveCell, Names etc.). The Excel application is hosted into a user control (excelCtl) which is located on the frmExcelControl form.

You can manage template files in standard XLS or in XML formats (create new, open, save). You can insert objects needed to create the template into the currently opened Excel workbook:

fields, as internal names and as formula in the active cell:

int tableIndex = cboTables.SelectedIndex;

if(tableIndex >= 0 && fieldIndex >= 0 && 
   excelCtl.Application != null && 
   excelCtl.Application.Workbooks.Count > 0 
   && excelCtl.Application.ActiveWorkbook != null)
{
    item = listFields.Items[fieldIndex].ToString();
    if(item.Trim() != string.Empty)
    {
        excelCtl.Application.ActiveWorkbook.Names.Add(
          cboTables.Items[tableIndex].ToString() + item, 
          cboTables.Items[tableIndex].ToString() + item, 
          missing, missing, missing, missing, missing, 
          missing, missing, missing, missing);
        excelCtl.Application.ActiveCell.Formula = 
          "=" + cboTables.Items[tableIndex].ToString() + item;
    }
}

repeat marks, as internal names and as formula in the active cell:

int tableIndex = cboTables.SelectedIndex;
int fieldIndex = listFields.SelectedIndex;
if(tableIndex >= 0 && excelCtl.Application != null && 
   excelCtl.Application.Workbooks.Count > 0 && 
   excelCtl.Application.ActiveWorkbook != null)
{
    string tableName = cboTables.Items[tableIndex].ToString();
    object oIndex = htTablesRepeatMarksIndexes[tableName];
    int index = 0;
    if(oIndex != null)
    {
        index = int.Parse(oIndex.ToString());
    }
    index++;
    htTablesRepeatMarksIndexes[tableName] = index;
    excelCtl.Application.ActiveWorkbook.Names.Add(tableName + 
             repeatAttribute + index.ToString(), tableName + 
             repeatAttribute + index.ToString(), missing, 
             missing, missing, missing, missing, 
             missing, missing, missing, missing);
    excelCtl.Application.ActiveCell.Formula = "=" + 
             tableName + repeatAttribute + index.ToString();
    LoadRepeatMarks();
}

formatting specifiers, only as internal names:

int languageIndex = cboLanguages.SelectedIndex;
if(languageIndex >= 0 && excelCtl.Application != null && 
   excelCtl.Application.Workbooks.Count > 0 && 
   excelCtl.Application.ActiveWorkbook != null)
{
    string languageName = 
      cboLanguages.Items[languageIndex].ToString();
    object oVariableValue;
    object oVariableName;
    Excel.Name var;

    oVariableValue = (object)languageName;
    oVariableName = (object)"LanguageName";
    try
    {
        var = 
          excelCtl.Application.ActiveWorkbook.Names.Item(
          oVariableName, missing, missing);
    }
    catch
    {
        var = null;
    }
    if(var != null)
        var.Value = languageName;
    else
        excelCtl.Application.ActiveWorkbook.Names.Add(
          "LanguageName", languageName, missing, missing, 
          missing, missing, missing, missing, 
          missing, missing, missing);
}

Template visualization and CExcelMLFiller class

The class usage is similar to the Word CExWordMLFiller class:

CExcelMLFiller filler = new CExcelMLFiller(dsData, 
                            xmlTemplateDoc.OuterXml);
if(!filler.OperationFailed)
{
    filler.Transform();
    if(filler.OperationFailed)
    {
        foreach(string err in filler.ErrorList)
        {
            MessageBox.Show(err, "Error", 
              MessageBoxButtons.OK, MessageBoxIcon.Error);
        }
        return;
    }
    string copyFileName = Path.GetTempFileName() + ".xml";
    filler.ExcelMLDocument.Save(copyFileName);
    Process.Start(copyFileName);
}
else
{
    foreach(string err in filler.ErrorList)
    {
        MessageBox.Show(err, "Error", 
                   MessageBoxButtons.OK, 
                   MessageBoxIcon.Error);
    }
}

First, the class load template content and sets formatting information (stored in internal names), and this is done in the constructor. Next, the Transform method replace formulas in the corresponding cells configured with data fields with the data in the dataset, identifying records which are repeated (for the data tables which contains many rows). In the ExcelML file, only the records which have non-empty content are added, so every row in the sheet will be a template row. The template rows collection is identified and then, for each data table, the rows are filled with the data and/or multiplied. After the fill operation completion, the template rows are removed:

try
{
    tableNode = xmlTemplateDoc.SelectSingleNode(
        "/ss:Workbook/ss:Worksheet/ss:Table", nsmgr);
    templateRowsColl = xmlTemplateDoc.SelectNodes(
        "/ss:Workbook/ss:Worksheet/ss:Table/ss:Row", nsmgr);
    foreach(DataTable dt in dsData.Tables)
    {
        TransformTemplateRows(dt);
    }
    int i = templateRowsColl.Count - 1;
    while(i >= 0)
    {
        XmlNode repeatNode = 
          templateRowsColl[i].SelectSingleNode(
          "ss:Cell[contains(@ss:Formula, '" + 
          repeatAttribute + "')]", nsmgr);
        bool bIsRepeat = repeatNode != null;
        if(bIsRepeat)
        {
            tableNode.RemoveChild(templateRowsColl[i]);
        }
        i--;
    }
    ((XmlElement)tableNode).RemoveAttribute("ss:ExpandedRowCount");
}
catch(Exception ex)
{
    while(ex != null)
    {
        errorList.Add(ex.Message);
        ex = ex.InnerException;
    }
    bOperationFailed = true;
}

The TransformTemplateRows receives a DataTable object as parameter and, for each template row, checks if the row is a repeated or a non-repeated one. If it is repeated, the row is multiplied for every DataRow in the data table, otherwise it is just filled with data:

foreach(XmlNode rowNode in templateRowsColl)
{
    bool bIsRepeat;
    XmlNode repeatNode = 
      rowNode.SelectSingleNode("ss:Cell[contains" + 
      "(@ss:Formula, '=" + dt.TableName + 
      repeatAttribute + "')]", nsmgr);
    bIsRepeat = repeatNode != null;
    string templateFieldName;
    if(bIsRepeat)
    {
        tableNode = rowNode.ParentNode;
        foreach(DataRow dr in dt.Rows)
        {
            XmlNode newRowNode = rowNode.Clone();
            ((XmlElement)newRowNode).RemoveAttribute("ss:Index");
            repeatNode = newRowNode.SelectSingleNode(
              "ss:Cell[contains(@ss:Formula, '=" + 
              dt.TableName + repeatAttribute + "')]", nsmgr);
            ((XmlElement)repeatNode).RemoveAttribute("ss:Formula");
            repeatNode.FirstChild.InnerText = string.Empty;
            tableNode.InsertBefore(newRowNode, rowNode);
            for(int i = 0; i < dr.ItemArray.Length; i++)
            {
                templateFieldName = dt.TableName + 
                  dt.Columns[i].ColumnName;
                ReplaceFieldData(newRowNode, 
                  templateFieldName, dr[i].ToString(), 
                  dt.Columns[i].DataType);
            }
        }
    }
    else
    {
        if(dt.Rows.Count > 0)
        {
            DataRow firstRow = dt.Rows[0];
            for(int i = 0; i < firstRow.ItemArray.Length; i++)
            {
                templateFieldName = dt.TableName + 
                         dt.Columns[i].ColumnName;
                ReplaceFieldData(rowNode, templateFieldName, 
                 firstRow[i].ToString(), dt.Columns[i].DataType);
            }
        }
    }
}

The ReplaceFieldData adds data into a row node and tries to format it, if the formatting information is available:

errorList = new ArrayList();
bOperationFailed = false;
XmlNode dataNode;
XmlNodeList oColl;
oColl = baseNode.SelectNodes("ss:Cell[@ss:Formula='=" + 
                              fieldName + "']", nsmgr);
foreach(XmlNode fieldNode in oColl)
{
    dataNode = fieldNode.SelectSingleNode("ss:Data", nsmgr);
    if(dataNode == null)
    {
        errorList.Add("The field data is selected " + 
          "from the fields definition data source " + 
          "or merge document is corrupted!");
        bOperationFailed = true;
        return ;
    }
    ((XmlElement)fieldNode).RemoveAttribute("ss:Formula");
    if(colType == typeof(DateTime))
    {
        if(dateTimeFormat != null)
        {
            DateTime dt = DateTime.Parse(data);
            dataNode.InnerText = dt.ToString(dateTimeFormat);
        }
        else
        {
            dataNode.InnerText = data;
        }
        //((XmlElement)dataNode).SetAttribute("ss:Type", "DateTime");
    }
    else if(colType == typeof(int)
        || colType == typeof(short)
        || colType == typeof(long)
        )
    {
        if(numberFormat != null)
        {
            int i = int.Parse(data);
            dataNode.InnerText = i.ToString(numberFormat);
        }
        else
        {
            dataNode.InnerText = data;
        }
        ((XmlElement)dataNode).SetAttribute("ss:Type", "Number");
    }
    else if(colType == typeof(decimal)
        || colType == typeof(float)
        || colType == typeof(double)
        )
    {
        if(numberFormat != null)
        {
            decimal d = decimal.Parse(data);
            dataNode.InnerText = d.ToString("N", numberFormat);
        }
        else
        {
            dataNode.InnerText = data;
        }
        ((XmlElement)dataNode).SetAttribute("ss:Type", "Number");
    }
    else
    {
        dataNode.InnerText = data;
        ((XmlElement)dataNode).SetAttribute("ss:Type", "String");
    }
}

Template visualization and the CExcelXLSFiller class

The code used to instantiate and to use this class is similar to the CWordDOCFiller class:

string templateFileName = Application.StartupPath + @"\Templates\Order.xls";
string copyFileName = Path.GetTempFileName() + ".xls";
File.Copy(templateFileName, copyFileName, true);

CExcelXLSFiller filler = new CExcelXLSFiller(dsData, copyFileName);
if(!filler.OperationFailed)
{
    filler.Transform();
    if(filler.OperationFailed)
    {
        foreach(string err in filler.ErrorList)
        {
            MessageBox.Show(err, "Error", 
              MessageBoxButtons.OK, MessageBoxIcon.Error);
        }
        return;
    }
}
else
{
    foreach(string err in filler.ErrorList)
    {
        MessageBox.Show(err, "Error", 
          MessageBoxButtons.OK, MessageBoxIcon.Error);
    }
}

In the constructor, the same operations (Excel application instantiation, template copy opening, formatting information loading) are available.
The Tranform method gets a reference to the first sheet in the workbook, and sets initial completed range in the sheet, as well as the initial formula array of the range. The range can be modified during the load process by multiplying template rows. We have to loop through the rows and cells and to check if the range has changed. This method calls two other methods: TransformRepeatRows for rows which have to be multiplied and TransformRow for one only row:

try
{
    oSheet = (Excel.Worksheet)oWorkbook.Worksheets[1];
    Excel.Range rng = 
      (Excel.Range)oSheet.get_Range("A1", missing);
    rng = rng.SpecialCells(
      Excel.XlCellType.xlCellTypeLastCell, missing);
    string address = rng.get_Address(false, false, 
      Excel.XlReferenceStyle.xlA1, missing, missing);
    Excel.Range oRng = oSheet.get_Range("A1", address);
    Excel.Range rowRng;
    int rowCount = oRng.Rows.Count;
    int colCount = oRng.Columns.Count;
    string formula = string.Empty;
    object[,] arrFormula = (object[,])oRng.Formula;
    bool bIsRepeat = false;
    int i = 1;
    int indexRows = 0;
    int indexRepeatFormula = -1;
    while(i <= rowCount)
    {
        bIsRepeat = false;
        for(int j = 1; j <= colCount; j++)
        {
            formula = arrFormula[i, j].ToString();
            if(formula.IndexOf(repeatAttribute) != -1)
            {
                bIsRepeat = true;
                indexRepeatFormula = j;
                break;
            }
        }
        if(bIsRepeat)
        {
            TransformRepeatRows(i, colCount, formula, 
                  indexRepeatFormula, out indexRows);
            i += indexRows;
            rowRng = oSheet.get_Range("A" + i.ToString(), 
                                      missing).EntireRow;
            rowRng.Delete(Excel.XlDeleteShiftDirection.xlShiftUp);
        }
        else
        {
            TransformRow(i, colCount);
            i++;
        }
        if(rowCount != oRng.Rows.Count)
        {
            rowCount = oRng.Rows.Count;
            arrFormula = (object[,])oRng.Formula;
        }
    }
    oApp.Visible = true;
}
catch(Exception ex)
{
    while(ex != null)
    {
        errorList.Add(ex.Message);
        ex = ex.InnerException;
    }
    bOperationFailed = true;
}

A range in the sheet is found using get_Range method of the oSheet object. An address is obtained using the get_Address method of the Range object and the last typed cell is found using SpecialCells method of the Range object. The TransformRepeatRows method multiplies specified template row and add data from the corresponding table. The reference for an entire row in a range is given by EntireRow property. The template row is copied using a simple insertion first, and next using the Copy method:

indexRows = 0;
string sName = formula.Replace("=", string.Empty);
sName = sName.Substring(0, sName.IndexOf(repeatAttribute));
sName.Replace(repeatAttribute, string.Empty);
Excel.Range oNewRow, oCell;
int newRowIndex = rowIndex;
Excel.Range oRow = oSheet.get_Range("A" + 
        newRowIndex.ToString(), missing);
char colLetter;
foreach(DataTable dt in dsData.Tables)
{
    if(sName.StartsWith(dt.TableName))
    {
        sName = sName.Replace(dt.TableName, string.Empty);
        foreach(DataRow dr in dt.Rows)
        {
            indexRows++;
            oRow = oSheet.get_Range("A" + 
                   newRowIndex.ToString(), 
                   missing).EntireRow;
            oNewRow = oSheet.get_Range("A" + 
                      (newRowIndex + 1).ToString(), 
                      missing).EntireRow;
            oNewRow.Insert(
              Excel.XlInsertShiftDirection.xlShiftDown, missing);
            oRow.Copy(oSheet.get_Range("A" + 
              (newRowIndex + 1).ToString(), missing));
            colLetter = (char)(65 + indexRepeatFormula - 1);
            oCell = oSheet.get_Range(colLetter.ToString() + 
              newRowIndex.ToString(), missing);
            oCell.Formula = string.Empty;

            for(int j = 1; j <= colCount; j++)
            {
                if(j != indexRepeatFormula)
                {
                    ReplaceFieldData(newRowIndex, j, dr);
                }
            }

            newRowIndex++;
        }
        break;
    }
}

The TransformRow method is used to fill a template row with data extracted from the first row in a DataTable object:

DataRow firstRow;
foreach(DataTable dt in dsData.Tables)
{
    if(dt.Rows.Count > 0)
    {
        firstRow = dt.Rows[0];
        for(int j = 1; j <= colCount; j++)
        {
            ReplaceFieldData(rowIndex, j, firstRow);
        }
    }
}

The ReplaceFieldData method adds data in the cells contained into a row, using formatting information if available:

char colLetter = (char)(65 + colIndex - 1);
Excel.Range oCell = oSheet.get_Range(colLetter.ToString() + 
                    rowIndex.ToString(), missing);
string fieldName = oCell.Formula.ToString();
string tableName = dr.Table.TableName;
Type colType = System.Type.Missing.GetType();
string data;
if(fieldName.Trim() != string.Empty && 
   fieldName.StartsWith("=" + tableName))
{
    fieldName = fieldName.Replace("=", string.Empty);
    fieldName = fieldName.Substring(
                  fieldName.IndexOf(tableName) + 
                  tableName.Length);
    oCell.Formula = string.Empty;
    data = dr[fieldName].ToString();
    colType = dr.Table.Columns[fieldName].DataType;
    if(colType == typeof(DateTime))
    {
        if(dateTimeFormat != null)
        {
            DateTime dt = DateTime.Parse(data);
            oCell.Value2 = dt.ToString(dateTimeFormat);
        }
        else
        {
            oCell.Value2 = data;
        }
    }
    else if(colType == typeof(int)
        || colType == typeof(short)
        || colType == typeof(long)
        )
    {
        if(numberFormat != null)
        {
            int i = int.Parse(data);
            oCell.Value2 = i.ToString(numberFormat);
        }
        else
        {
            oCell.Value2 = data;
        }
    }
    else if(colType == typeof(decimal)
        || colType == typeof(float)
        || colType == typeof(double)
        )
    {
        if(numberFormat != null)
        {
            decimal d = decimal.Parse(data);
            oCell.Value2 = d.ToString("N", numberFormat);
        }
        else
        {
            oCell.Value2 = data;
        }
    }
    else
    {
        oCell.Value2 = data;
    }
}

Using the application

The solution ExcelDataSetTemplateEditor.root contains four projects:

ExcelControl – the user control which hosts the application;
ExcelDataSetTemplateEditor – the main editor project;
NorthwindDA – the data access component for the Northwind database;
Test – the test application for a specified order and order details visualization, and for an alphabetical list of products.

The templates Order.xml and Order.xls in the Templates folder are the needed templates for the Test application. When the project is compiled, these templates are copied into a similar folder in the application startup path.

Conclusion

MS Office is developed in unmanaged code and it seems we won't have so soon a "managed" Office. The XML alternative helps programmers very much to have access to office documents. In the next Office 2007, the standard documents format has changed (.docx for Word and .xlsx for Excel). They will be just .zip archives which contain XML files, pictures and other binary files. The zip package contains parts which are related between them and which are stored in different folders. As expected, the XML format is WordML for Word documents and SpreadsheetML for Excel. The API provided by .NET framework 3.0 (in System.IO.Packaging) is only related to the Package and PackagePart objects, responsible for archiving/de-archiving and for adding parts in the package. There is no API to access atomic elements, so a template consumer as explained in this article (using XML DOM and Office XML Schema) it would be a good approach to work with this format.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)