Introduction
In the article WordML Templates Editor – data visualization using Office 2003 WordML features, we have tackled the data visualization issue using Word ML and Word DOM approaches. This article raises the same problem, but in an MS Excel way. It is just a simple "template" editor which allows users to design a visualization file with data gathered from their databases. The sample is using the same information from Northwind database: company, order, order details, and the list of products. The application uses a standard Excel feature to expose fields, repeating attributes and formatting specifiers: the Name
object (and the Names
collection). In SpreadsheetML schema, the alternatives are the ss:Names
and ss:NamedRange
elements.
Template creation
As in the Word template creation case, the editor offers access to data tables and data fields, it allows repeat marks insertion (for tables with many records) and formatting variables specifying. The code is very similar to the Word Templates Editor, except the Excel API usage (Workbooks
, Sheets
, ActiveWorkbook
, ActiveCell
, Names
etc.). The Excel application is hosted into a user control (excelCtl
) which is located on the frmExcelControl
form.
You can manage template files in standard XLS or in XML formats (create new, open, save). You can insert objects needed to create the template into the currently opened Excel workbook:
- fields, as internal names and as formula in the active cell:
int tableIndex = cboTables.SelectedIndex;
if(tableIndex >= 0 && fieldIndex >= 0 &&
excelCtl.Application != null &&
excelCtl.Application.Workbooks.Count > 0
&& excelCtl.Application.ActiveWorkbook != null)
{
item = listFields.Items[fieldIndex].ToString();
if(item.Trim() != string.Empty)
{
excelCtl.Application.ActiveWorkbook.Names.Add(
cboTables.Items[tableIndex].ToString() + item,
cboTables.Items[tableIndex].ToString() + item,
missing, missing, missing, missing, missing,
missing, missing, missing, missing);
excelCtl.Application.ActiveCell.Formula =
"=" + cboTables.Items[tableIndex].ToString() + item;
}
}
- repeat marks, as internal names and as formula in the active cell:
int tableIndex = cboTables.SelectedIndex;
int fieldIndex = listFields.SelectedIndex;
if(tableIndex >= 0 && excelCtl.Application != null &&
excelCtl.Application.Workbooks.Count > 0 &&
excelCtl.Application.ActiveWorkbook != null)
{
string tableName = cboTables.Items[tableIndex].ToString();
object oIndex = htTablesRepeatMarksIndexes[tableName];
int index = 0;
if(oIndex != null)
{
index = int.Parse(oIndex.ToString());
}
index++;
htTablesRepeatMarksIndexes[tableName] = index;
excelCtl.Application.ActiveWorkbook.Names.Add(tableName +
repeatAttribute + index.ToString(), tableName +
repeatAttribute + index.ToString(), missing,
missing, missing, missing, missing,
missing, missing, missing, missing);
excelCtl.Application.ActiveCell.Formula = "=" +
tableName + repeatAttribute + index.ToString();
LoadRepeatMarks();
}
- formatting specifiers, only as internal names:
int languageIndex = cboLanguages.SelectedIndex;
if(languageIndex >= 0 && excelCtl.Application != null &&
excelCtl.Application.Workbooks.Count > 0 &&
excelCtl.Application.ActiveWorkbook != null)
{
string languageName =
cboLanguages.Items[languageIndex].ToString();
object oVariableValue;
object oVariableName;
Excel.Name var;
oVariableValue = (object)languageName;
oVariableName = (object)"LanguageName";
try
{
var =
excelCtl.Application.ActiveWorkbook.Names.Item(
oVariableName, missing, missing);
}
catch
{
var = null;
}
if(var != null)
var.Value = languageName;
else
excelCtl.Application.ActiveWorkbook.Names.Add(
"LanguageName", languageName, missing, missing,
missing, missing, missing, missing,
missing, missing, missing);
}
Template visualization and CExcelMLFiller class
The class usage is similar to the Word CExWordMLFiller
class:
CExcelMLFiller filler = new CExcelMLFiller(dsData,
xmlTemplateDoc.OuterXml);
if(!filler.OperationFailed)
{
filler.Transform();
if(filler.OperationFailed)
{
foreach(string err in filler.ErrorList)
{
MessageBox.Show(err, "Error",
MessageBoxButtons.OK, MessageBoxIcon.Error);
}
return;
}
string copyFileName = Path.GetTempFileName() + ".xml";
filler.ExcelMLDocument.Save(copyFileName);
Process.Start(copyFileName);
}
else
{
foreach(string err in filler.ErrorList)
{
MessageBox.Show(err, "Error",
MessageBoxButtons.OK,
MessageBoxIcon.Error);
}
}
First, the class load template content and sets formatting information (stored in internal names), and this is done in the constructor. Next, the
Transform
method replace formulas in the corresponding cells configured with data fields with the data in the dataset, identifying records which are repeated (for the data tables which contains many rows). In the ExcelML file, only the records which have non-empty content are added, so every row in the sheet will be a template row. The template rows collection is identified and then, for each data table, the rows are filled with the data and/or multiplied. After the fill operation completion, the template rows are removed:
try
{
tableNode = xmlTemplateDoc.SelectSingleNode(
"/ss:Workbook/ss:Worksheet/ss:Table", nsmgr);
templateRowsColl = xmlTemplateDoc.SelectNodes(
"/ss:Workbook/ss:Worksheet/ss:Table/ss:Row", nsmgr);
foreach(DataTable dt in dsData.Tables)
{
TransformTemplateRows(dt);
}
int i = templateRowsColl.Count - 1;
while(i >= 0)
{
XmlNode repeatNode =
templateRowsColl[i].SelectSingleNode(
"ss:Cell[contains(@ss:Formula, '" +
repeatAttribute + "')]", nsmgr);
bool bIsRepeat = repeatNode != null;
if(bIsRepeat)
{
tableNode.RemoveChild(templateRowsColl[i]);
}
i--;
}
((XmlElement)tableNode).RemoveAttribute("ss:ExpandedRowCount");
}
catch(Exception ex)
{
while(ex != null)
{
errorList.Add(ex.Message);
ex = ex.InnerException;
}
bOperationFailed = true;
}
The
TransformTemplateRows
receives a
DataTable
object as parameter and, for each template row, checks if the row is a repeated or a non-repeated one. If it is repeated, the row is multiplied for every
DataRow
in the data table, otherwise it is just filled with data:
foreach(XmlNode rowNode in templateRowsColl)
{
bool bIsRepeat;
XmlNode repeatNode =
rowNode.SelectSingleNode("ss:Cell[contains" +
"(@ss:Formula, '=" + dt.TableName +
repeatAttribute + "')]", nsmgr);
bIsRepeat = repeatNode != null;
string templateFieldName;
if(bIsRepeat)
{
tableNode = rowNode.ParentNode;
foreach(DataRow dr in dt.Rows)
{
XmlNode newRowNode = rowNode.Clone();
((XmlElement)newRowNode).RemoveAttribute("ss:Index");
repeatNode = newRowNode.SelectSingleNode(
"ss:Cell[contains(@ss:Formula, '=" +
dt.TableName + repeatAttribute + "')]", nsmgr);
((XmlElement)repeatNode).RemoveAttribute("ss:Formula");
repeatNode.FirstChild.InnerText = string.Empty;
tableNode.InsertBefore(newRowNode, rowNode);
for(int i = 0; i < dr.ItemArray.Length; i++)
{
templateFieldName = dt.TableName +
dt.Columns[i].ColumnName;
ReplaceFieldData(newRowNode,
templateFieldName, dr[i].ToString(),
dt.Columns[i].DataType);
}
}
}
else
{
if(dt.Rows.Count > 0)
{
DataRow firstRow = dt.Rows[0];
for(int i = 0; i < firstRow.ItemArray.Length; i++)
{
templateFieldName = dt.TableName +
dt.Columns[i].ColumnName;
ReplaceFieldData(rowNode, templateFieldName,
firstRow[i].ToString(), dt.Columns[i].DataType);
}
}
}
}
The
ReplaceFieldData
adds data into a row node and tries to format it, if the formatting information is available:
errorList = new ArrayList();
bOperationFailed = false;
XmlNode dataNode;
XmlNodeList oColl;
oColl = baseNode.SelectNodes("ss:Cell[@ss:Formula='=" +
fieldName + "']", nsmgr);
foreach(XmlNode fieldNode in oColl)
{
dataNode = fieldNode.SelectSingleNode("ss:Data", nsmgr);
if(dataNode == null)
{
errorList.Add("The field data is selected " +
"from the fields definition data source " +
"or merge document is corrupted!");
bOperationFailed = true;
return ;
}
((XmlElement)fieldNode).RemoveAttribute("ss:Formula");
if(colType == typeof(DateTime))
{
if(dateTimeFormat != null)
{
DateTime dt = DateTime.Parse(data);
dataNode.InnerText = dt.ToString(dateTimeFormat);
}
else
{
dataNode.InnerText = data;
}
}
else if(colType == typeof(int)
|| colType == typeof(short)
|| colType == typeof(long)
)
{
if(numberFormat != null)
{
int i = int.Parse(data);
dataNode.InnerText = i.ToString(numberFormat);
}
else
{
dataNode.InnerText = data;
}
((XmlElement)dataNode).SetAttribute("ss:Type", "Number");
}
else if(colType == typeof(decimal)
|| colType == typeof(float)
|| colType == typeof(double)
)
{
if(numberFormat != null)
{
decimal d = decimal.Parse(data);
dataNode.InnerText = d.ToString("N", numberFormat);
}
else
{
dataNode.InnerText = data;
}
((XmlElement)dataNode).SetAttribute("ss:Type", "Number");
}
else
{
dataNode.InnerText = data;
((XmlElement)dataNode).SetAttribute("ss:Type", "String");
}
}
Template visualization and the CExcelXLSFiller class
The code used to instantiate and to use this class is similar to the CWordDOCFiller
class:
string templateFileName = Application.StartupPath + @"\Templates\Order.xls";
string copyFileName = Path.GetTempFileName() + ".xls";
File.Copy(templateFileName, copyFileName, true);
CExcelXLSFiller filler = new CExcelXLSFiller(dsData, copyFileName);
if(!filler.OperationFailed)
{
filler.Transform();
if(filler.OperationFailed)
{
foreach(string err in filler.ErrorList)
{
MessageBox.Show(err, "Error",
MessageBoxButtons.OK, MessageBoxIcon.Error);
}
return;
}
}
else
{
foreach(string err in filler.ErrorList)
{
MessageBox.Show(err, "Error",
MessageBoxButtons.OK, MessageBoxIcon.Error);
}
}
In the constructor, the same operations (Excel application instantiation, template copy opening, formatting information loading) are available.
The
Tranform
method gets a reference to the first sheet in the workbook, and sets initial completed range in the sheet, as well as the initial formula array of the range. The range can be modified during the load process by multiplying template rows. We have to loop through the rows and cells and to check if the range has changed. This method calls two other methods:
TransformRepeatRows
for rows which have to be multiplied and
TransformRow
for one only row:
try
{
oSheet = (Excel.Worksheet)oWorkbook.Worksheets[1];
Excel.Range rng =
(Excel.Range)oSheet.get_Range("A1", missing);
rng = rng.SpecialCells(
Excel.XlCellType.xlCellTypeLastCell, missing);
string address = rng.get_Address(false, false,
Excel.XlReferenceStyle.xlA1, missing, missing);
Excel.Range oRng = oSheet.get_Range("A1", address);
Excel.Range rowRng;
int rowCount = oRng.Rows.Count;
int colCount = oRng.Columns.Count;
string formula = string.Empty;
object[,] arrFormula = (object[,])oRng.Formula;
bool bIsRepeat = false;
int i = 1;
int indexRows = 0;
int indexRepeatFormula = -1;
while(i <= rowCount)
{
bIsRepeat = false;
for(int j = 1; j <= colCount; j++)
{
formula = arrFormula[i, j].ToString();
if(formula.IndexOf(repeatAttribute) != -1)
{
bIsRepeat = true;
indexRepeatFormula = j;
break;
}
}
if(bIsRepeat)
{
TransformRepeatRows(i, colCount, formula,
indexRepeatFormula, out indexRows);
i += indexRows;
rowRng = oSheet.get_Range("A" + i.ToString(),
missing).EntireRow;
rowRng.Delete(Excel.XlDeleteShiftDirection.xlShiftUp);
}
else
{
TransformRow(i, colCount);
i++;
}
if(rowCount != oRng.Rows.Count)
{
rowCount = oRng.Rows.Count;
arrFormula = (object[,])oRng.Formula;
}
}
oApp.Visible = true;
}
catch(Exception ex)
{
while(ex != null)
{
errorList.Add(ex.Message);
ex = ex.InnerException;
}
bOperationFailed = true;
}
A range in the sheet is found using
get_Range
method of the
oSheet
object. An address is obtained using the
get_Address
method of the Range object and the last typed cell is found using
SpecialCells
method of the Range object. The
TransformRepeatRows
method multiplies specified template row and add data from the corresponding table. The reference for an entire row in a range is given by
EntireRow
property. The template row is copied using a simple insertion first, and next using the
Copy
method:
indexRows = 0;
string sName = formula.Replace("=", string.Empty);
sName = sName.Substring(0, sName.IndexOf(repeatAttribute));
sName.Replace(repeatAttribute, string.Empty);
Excel.Range oNewRow, oCell;
int newRowIndex = rowIndex;
Excel.Range oRow = oSheet.get_Range("A" +
newRowIndex.ToString(), missing);
char colLetter;
foreach(DataTable dt in dsData.Tables)
{
if(sName.StartsWith(dt.TableName))
{
sName = sName.Replace(dt.TableName, string.Empty);
foreach(DataRow dr in dt.Rows)
{
indexRows++;
oRow = oSheet.get_Range("A" +
newRowIndex.ToString(),
missing).EntireRow;
oNewRow = oSheet.get_Range("A" +
(newRowIndex + 1).ToString(),
missing).EntireRow;
oNewRow.Insert(
Excel.XlInsertShiftDirection.xlShiftDown, missing);
oRow.Copy(oSheet.get_Range("A" +
(newRowIndex + 1).ToString(), missing));
colLetter = (char)(65 + indexRepeatFormula - 1);
oCell = oSheet.get_Range(colLetter.ToString() +
newRowIndex.ToString(), missing);
oCell.Formula = string.Empty;
for(int j = 1; j <= colCount; j++)
{
if(j != indexRepeatFormula)
{
ReplaceFieldData(newRowIndex, j, dr);
}
}
newRowIndex++;
}
break;
}
}
The
TransformRow
method is used to fill a template row with data extracted from the first row in a
DataTable
object:
DataRow firstRow;
foreach(DataTable dt in dsData.Tables)
{
if(dt.Rows.Count > 0)
{
firstRow = dt.Rows[0];
for(int j = 1; j <= colCount; j++)
{
ReplaceFieldData(rowIndex, j, firstRow);
}
}
}
The
ReplaceFieldData
method adds data in the cells contained into a row, using formatting information if available:
char colLetter = (char)(65 + colIndex - 1);
Excel.Range oCell = oSheet.get_Range(colLetter.ToString() +
rowIndex.ToString(), missing);
string fieldName = oCell.Formula.ToString();
string tableName = dr.Table.TableName;
Type colType = System.Type.Missing.GetType();
string data;
if(fieldName.Trim() != string.Empty &&
fieldName.StartsWith("=" + tableName))
{
fieldName = fieldName.Replace("=", string.Empty);
fieldName = fieldName.Substring(
fieldName.IndexOf(tableName) +
tableName.Length);
oCell.Formula = string.Empty;
data = dr[fieldName].ToString();
colType = dr.Table.Columns[fieldName].DataType;
if(colType == typeof(DateTime))
{
if(dateTimeFormat != null)
{
DateTime dt = DateTime.Parse(data);
oCell.Value2 = dt.ToString(dateTimeFormat);
}
else
{
oCell.Value2 = data;
}
}
else if(colType == typeof(int)
|| colType == typeof(short)
|| colType == typeof(long)
)
{
if(numberFormat != null)
{
int i = int.Parse(data);
oCell.Value2 = i.ToString(numberFormat);
}
else
{
oCell.Value2 = data;
}
}
else if(colType == typeof(decimal)
|| colType == typeof(float)
|| colType == typeof(double)
)
{
if(numberFormat != null)
{
decimal d = decimal.Parse(data);
oCell.Value2 = d.ToString("N", numberFormat);
}
else
{
oCell.Value2 = data;
}
}
else
{
oCell.Value2 = data;
}
}
Using the application
The solution ExcelDataSetTemplateEditor.root
contains four projects:
ExcelControl
– the user control which hosts the application;
- ExcelDataSetTemplateEditor – the main editor project;
NorthwindDA
– the data access component for the Northwind database;
- Test – the test application for a specified order and order details visualization, and for an alphabetical list of products.
The templates Order.xml and Order.xls in the Templates folder are the needed templates for the Test application. When the project is compiled, these templates are copied into a similar folder in the application startup path.
Conclusion
MS Office is developed in unmanaged code and it seems we won't have so soon a "managed" Office. The XML alternative helps programmers very much to have access to office documents. In the next Office 2007, the standard documents format has changed (.docx for Word and .xlsx for Excel). They will be just .zip archives which contain XML files, pictures and other binary files. The zip package contains parts which are related between them and which are stored in different folders. As expected, the XML format is WordML for Word documents and SpreadsheetML for Excel. The API provided by .NET framework 3.0 (in System.IO.Packaging
) is only related to the Package
and PackagePart
objects, responsible for archiving/de-archiving and for adding parts in the package. There is no API to access atomic elements, so a template consumer as explained in this article (using XML DOM and Office XML Schema) it would be a good approach to work with this format.