Latest versions [External links]
Introduction
Just like any other Excel XML export article/library writer, my quest started when I wanted to export a simple report to Excel. I knew that writing an Excel XML library is very easy, and I was sure I will find plenty of such libraries on the Internet. I did find many, but none of them were simple, yet intuitive, powerful, and implemented everything I needed. I could take ages to write a background component like an export library, but when using it, I wanted it to do my work fast and without fuss. The library should also be compatible with .NET 2.0/3.0/3.5.
So, I came with this library which is easy and fast to use, yet is very powerful to be considered to be one of the most powerful XML export libraries present. OK, I will count it as a shameless self promotion.
P.S.: The XML file format is only supported by Excel versions XP, 2003, and 2007. Previous versions, i.e., Excel 97 and Excel 2000 do not support this feature.
Features
There are a multitude of features which are present in the library. They are...
- Full import and export of Excel XML files
- Access cells from anywhere including
ExcelXmlWorkbook
, Worksheet
, Row
, and Cell
- Full support for formulae and ranges
- Imported files' formulae and ranges are parsed to
Formula
and Range
instances - Multitude of Insert, Delete, and Add functions available for
ExcelXmlWorkbook
, Worksheet
, Row
, and Cell
- Auto management of cell references in the book on any change made via any of the Insert, Delete, and Add functions
- Full support for styles in
Worksheet
, Row
, and Cell
, and even Range
- No instance declaration; all the dirty work is done by the library
- Cell ContentType, which enables you to know what the cell actually contains
DataSet
to ExcelXmlWorkbook
conversion facility- Absolute and non-absolute ranges and Named ranges
- Auto filter
- Very fast as everything is done via
XmlWriter
- Ability to freeze rows or columns, or both
- Almost all print options
- Almost all style options
- Many formatting options
- Hidden columns and rows
Using the library
Using the code is very easy. This was the primary concern when I was building this library. The primary or top level class is ExcelXmlWorkbook
which contains multiple Worksheet
s. The library resides in Yogesh.Extensions.ExcelXml
. The following example shows the various ways of adding cells in a Workbook, right from creating a instance.
ExcelXmlWorkbook book = new ExcelXmlWorkbook();
book.Properties.Author = "Yogesh Jagota";
Worksheet sheet = book[0];
sheet.Name = "AgewiseOutstanding";
sheet.FreezeTopRows = 3;
sheet.PrintOptions.Orientation = PageOrientation.Landscape;
sheet.PrintOptions.SetMargins(0.5, 0.4, 0.5, 0.4);
sheet[0, 0].Value = "Outstanding as on " + DateTime.Now;
sheet[0, 1].Value = "Name of Party";
sheet[1, 1].Value = "RSM";
sheet[2, 1].Value = "ASM";
sheet[3, 1].Value = "0-30";
sheet[4, 1].Value = "31-60";
sheet[5, 1].Value = "61-90";
sheet[6, 1].Value = "91+";
sheet[0, 2].Value = "M/s Stupid Paymaster";
sheet[1, 2].Value = "Mr. Nonsense";
sheet[2, 2].Value = "Mr. More Nonsense";
sheet[3, 2].Value = 0;
sheet[4, 2].Value = 5000;
sheet[5, 2].Value = 45000;
sheet[6, 2].Value = 0;
sheet[0, 3].Value = "M/s Good Paymaster";
sheet[1, 3].Value = "Mr. Good RSM";
sheet[2, 3].Value = "Mr. Good ASM";
sheet[3, 3].Value = 32000;
sheet[4, 3].Value = 0;
sheet[5, 3].Value = 0;
sheet[6, 3].Value = 0;
sheet[7, 3].Value = sheet[6, 3];
string outputFile = "Outstanding File.xml";
book.Export(outputFile);
Importing a file
To import a file, you can either supply a file name or a Stream
object to the static ExcelXmlWorkbook.Import
method which returns a ExcelXmlWorkbook
instance loaded with the file. If any error occurs, the function simply returns null
, and there is no way to find out what error occurred. So, the preferred way to import is to pass a Stream
because it gives you more control with error management.
Exporting the file
All the code is written to disk only when the Export
function is called. Export
must be supplied with either a Stream
or a file name. If any error occurs, the function simply returns false
, and there is no way to find out what error occurred. So, the preferred way to export is to pass a Stream
because it gives you more control with error management.
Assigning values to cells
Notice the last assignment in the previous example:
sheet[7, 3].Value = sheet[6, 2];
Here, we are actually assigning a cell to a cell. What will be the value of the cell, you might wonder? The cell will not have a value at all. It will have a reference to the assigned cell, something like this when you will open the file in Excel: =G3. It won't be an absolute reference, more on that later. We can assign these values to a cell:
string
bool
- All integar types, i.e.,
byte
, sbyte
, int
, uint
, long
, ulong
, float
, double
, decimal
. DateTime
Cell
Formula
[more on this below]
Knowing the type of content a cell contains
Every Cell
contains a ContentType
readonly field which can be used to check what value type the cell contains. The available values are String
, Number
, Boolean
, DateTime
, Formula
, UnresolvedValue
.
Retrieving values from cells
A readonly property GetValue<T>
returns the cell value converted to the type supplied. You can use ContentType
with GetValue<T>
to retrieve the exact value of a cell. GetValue<T>
enables strict type checking when retrieving a cell value. Further, if the type supplied with GetValue<T>
does not match the type of the cell type, default(T)
is returned instead. For example, if a cell's ContentType == ContentType.Numeric
, the only way to retrieve the value of the cell is to supply a byte
, sbyte
, int
, uint
, long
, ulong
, float
, or double
. If a cell's ContentType == ContentType.String
, the only way to retrieve the value of the cell is to supply a string
etc.
Various ways of accessing the cells
There is no hard coded way of accessing a particular cell. There are numerous ways of doing so. For example, the fourth column of the second row in the last example can be set to a value of 1 by...
- Directly using the
ExcelXmlWorkbook
class.
book[0][3, 1].Value = 1
- Using the
Worksheet
class.
Worksheet sheet = book[0];
sheet[3, 1].Value = 1
- Using the
Row
class.
Worksheet sheet = book[0];
Row row = sheet[1];
row[3].Value = 1
- Using the
Cell
class.
Worksheet sheet = book[0];
Row row = sheet[1];
Cell cell = row[3];
cell.Value = 1
Note that we do not need to declare a instance of a new worksheet, row, or cell. All the dirty work is done by the library. This style of coding opens many ways of accessing cells and rows.
Styles
All cells, rows, and worksheets have styles which can be set individually. These are Font
, Alignment
, Interior
, Border
, and DisplayFormat
. More information can be found in the documentation about members of the style classes. Changing a worksheet style setting affects all cells in the worksheet. A row setting affects all child cells in the row, and a single cell setting affects, well, that very cell. Example:
sheet[1, 3].Font.Bold = true;
All the functionality of a style is implemented in a class XmlStyle
. You can create an instance to XmlStyle
in your code and assign it to the Style
property which is present in all the cells, rows, and worksheets. Example:
XmlStyle style = new XmlStyle();
style.Font.Bold = true;
sheet[1, 3].Style = style;
Ranges
The main reason for writing my own implementation was ranges, which I found missing, or not having the powers which ranges should have. In this library, ranges are very powerful and extendible. Ranges have all the style elements found in cells, rows, and worksheets. Example:
Range range = new Range(sheet[0, 2], sheet[7, 2]);
range.Font.Bold = true;
Even this is valid code, although many might recommend doing it the first way...
new Range(sheet[0, 2], sheet[7, 2]).Font.Bold = true;
Please note that ranges can not be assigned to a cell value. Assigning it will generate an empty cell. A range can contain a single cell or a range of cells. In the above example, we are providing the constructor with the first cell and the last cell. Ranges always contain rectangular ranges, just like in Excel.
Applying auto filter to ranges
To apply auto filter to a range, you only need to call the range's AutoFilter
method and you are done. Example:
new Range(sheet[0, 1], sheet[6, 3]).AutoFilter();
Absolute and non-absolute ranges
By default, all ranges output a non-absolute reference. To set up an absolute reference, just set the Absolute
property of the range to true
.
Range range = new Range(sheet[0 ,2], sheet[7, 2]);
range.Font.Bold = true;
range.Absolute = true;
Functions
Now, we come to the real use of ranges and their Absolute
property: adding functions. I think a function in my library can be easily understood by this example which uses the first example in this article.
sheet[7, 3].Value = FormulaHelper.Formula("sum", new Range(sheet[3, 3], sheet[6, 3]));
or:
sheet[7, 3].Value = new Formula().Add("sum").StartGroup().Add(
new Range(sheet[3, 3], sheet[6, 3])).EndGroup();
When you will open this book in Excel, the value of the cell will be =SUM(D4:G4).
Function parameters
Here, we have added a single parameter in the formula constructor. You can add as many parameters as you want using the Add
function of the Formula
class. Only two types of parameters are allowed though, string
or Range
. The string
parameter type can be used to add any value and named ranges also (read more about named ranges in the documentation). Example:
Formula formula = new Formula().Add("sum").StartGroup();
formula.Add("D4").Operator(',');
formula.Add(new Range(sheet[4, 3]) { Absolute = true } ).Operator(',');
formula.Add(new Range(sheet[5, 3], Range(sheet[6, 3])).EndGroup();
sheet[7, 3].Value = formula;
When you will open this book in Excel, the value of the cell will be =SUM(D4, $E$4, F4:G4).
Filtering cells as parameters by checking the cell value or style
You can filter all cells and auto add them to the parameter list of a formula by passing a parameter, i.e., a delegate which accepts Cell
as its value and returns bool
to both the Formula
constructor or Add
. All the value accessors (i.e., Value
, IntValue
, etc.) and cell styles can be checked. Examples:
XmlStyle style = new XmlStyle();
style.Font.Bold = true;
sheet[7, 3].Value = FormulaHelper.Formula("sum",
new Range(sheet[0, 3], sheet[6, 3]),
cell => cell.Style == style);
sheet[7, 3].Value = FormulaHelper.Formula("sum",
new Range(sheet[0, 3], sheet[6, 3]),
delegate (Cell cell) { return cell.Style == style; } );
Example 2:
sheet[7, 3].Value = FormulaHelper.Formula("sum",
new Range(sheet[0, 3], sheet[6, 3]),
cell => cell.GetValue<int>() > 10000 &&
cell.GetValue<int>() <= 50000);
In the first example of style, the value of the cell will be =SUM(A4:C4, F4:G4). Continuous range matching to true will be joined as one parameter, i.e., A4:C4, and not three parameters, i.e., A4,B4,C4.
Modifying imported Excel XML files
Imported Excel XML files can be modified directly via direct assignment just like new files. Further to this, there are many functions which allow insertion, deletion, and addition of...
- One or multiple worksheets in books, e.g.,
InsertSheetBefore
, InsertSheetAfter
. - One or multiple rows and columns in sheets, e.g.,
InsertColumnAfter
, InsertColumnsAfter
, InsertRowBefore
, InsertRowsBefore
. - One or multiple cells in rows e.g.,
InsertCellBefore
, InsertCellsBefore
.
See the documentation for more on these functions.
Exporting a DataSet to an ExcelXmlWorksheet
A static member in ExcelXmlWorksheet
, DataSetToWorkbook
, is provided which converts a DataSet
and returns a Worksheet reference. All the tables are converted into different sheets of the Workbook.
Usage
ExcelXmlWorksheet sheet = ExcelXmlWorksheet.DataSetToWorkbook(sourceDataSet)
Cell collection
Cell collection is a strongly typed List
collection with full support for LINQ. You can use the Add
method to add a Worksheet
, Range
, Row
, or Cell
. You can add all cells, or you can filter the cells using a predicate.
Memory
Looking at all this code might make you think that all the cells, rows, Worksheets, ranges must be using too much memory. They must also be having their own separate copy of styles which will cause extra overhead. The answer is no.
I have optimized the library to use as little memory as possible. As far as the styles go, if you have a 100,000 cell Workbook written programmatically, which contains only 10 individual styles, the number of styles in memory will only be 11, i.e., 10 separate styles + 1 default style. Although the styles are added on a book level, if you have 10 books with the 10 same styles present in all of them, the number of style instances active in the program will be 110.
Conclusion
I will love to hear your comments and suggestions. Any bugs can be reported here.
Updates
Code breaking changes in v3.29
The SetHeaderFooterMargin
method in PrintOptions
has been removed. Use the HeaderMargin
and FooterMargin
properties instead.
Code breaking changes in v3.06
The formula system does not work the way it used to, so the previous code might break. For backwards compatibility, I have included a static class FormulaHelper
which can be used with the previous code. Just replace the previous code in the following way:
cell.Value = new Formula("Sum", new Range(sheet[3, 3], sheet[6, 3]));
with:
cell.Value = FormulaHelper.Formula("Sum", new Range(sheet[3, 3], sheet[6, 3]));
History
- [26 Nov 2008] Revision 3.35
- Fixed an error with the GUID type storage.
- Fixed an error with the format generation of
ContentType.Time
.
- [19 Nov 2008] Revision 3.33
- Added GUID cell content type.
- Now
GetValue<string>
converts almost everything to string type including the newly added GUID cell content type. - Fixed
SetHeader
and SetFooter
where empty strings caused issues. - Fixed error with the new
IsRawContent
property.
- [12 Nov 2008] Revision 3.29
- Added support for tab and comma delimited export.
- Fixed error with
Workbook
's Add(string sheetName)
method, which added two sheets instead of one. - Fixed error in
Worksheet
's ImportTable
method where column index was using an one based index. - Fixed error where merged cells spanned across multiple lines causing a Table error in Excel.
- Fixed error where
DateTime
values were saved in 12 hour format, but did not took AM/PM into account and always saved as AM. - Converted the
HeaderMargin
and FooterMargin
fields to properties, and removed the SetHeaderFooterMargin
method. - Added the
Header
and Footer
string properties. - Added multipart header and footer string helpers, i.e.,
SetHeader
and SetFooter
. - Added support for print gridlines (
EnableGridlines
property). - Reset many internal values to
private
and cleaned up the project. - Added currency display format.
- Fixed error where
GetValue
was not retrieving a decimal value. - Set the default display format of dates to
GeneralDate
.
- [23 Jul 2008] Revision 3.06
- New formula system implemented.
RowSpan
and ColumnSpan
properties added to cell.- Small error in named range rename fixed.
- Freeze column won't work if freeze row is set. Fixed.
- Style was not saved in Column export. Fixed.
- Assembly file contained wrong information about this library. Fixed.
- Workbook export can throw a
NullReferenceException
when new XmlWriter
is created. Fixed. - Documented the
IStyle
interfaces.
- [10 Jun 2008] Revision 2.89
- Added Print Area support.
- Added the
TabColor
property to Worksheet. - Added reference (
HRef
) support to cell. - Added support for custom display formats.
- Added support for cell patterns.
- Fixed error where "Long Date", "Short Date", "Time", and "@" as
DisplayFormat
caused an exception. - Fixed number format issue where some international formats were not saved properly.
- [03 Apr 2008] Revision 2.82
- Fixed single cell merge bug.
- [19 Mar 2008] Revision 2.81
- Fixed multi-Worksheet import bug.
- [10 Mar 2008] Revision 2.80
- Added decimal support.
- Fixed an error where assigning 0 to a cell still caused an exception.
- [06 Mar 2008] Revision 2.79
- Added cell merge/unmerge support.
- Added
GetEnumerator
support for sheets, rows, and ranges. - Added the
CellCollection
class. - Fixed an error where numeric output of the cell contained global number format where it should only be US only format. Thanks to Reinhard.
- Added six new display format types and removed the
Custom
format type. - Added
Index
property to cell which also has a ExcelColumnIndex
property which returns columns in Excel format, e.g., A, AA, AC, FA.
- [28 Feb 2008] Revision 2.45
- Fixed an error where
GetValue<T>
was not accepting string
and DateTime
types. Thanks to Karl for pointing this out.
- [21 Feb 2008] Revision 2.44
- Fixed a error where assigning 0 to a cell caused a exception. Thanks to Ralf for pointing this out.
- [20 Feb 2008] Revision 2.43
- [19 Jan 2008] Revision 1.30
- Added complete documentation.
- Added
DataSet
export capability. - Added range auto filter.
- Added print row and column headers
- Some name changes to remove warnings reported by FxCop. (
ExcelXmlWorkBook
to ExcelXmlWorkbook
, WorkSheet
to Worksheet
, and CellCompareDelegate
to CellCompare
. CellCompare
is now in Yogesh.Extensions.ExcelXml
and not in Yogesh.Extensions.ExcelXml.Formula
.- Added cell comments.
- [16 Jan 2008] Revision 1.0
- First release on The Code Project.