(untagged)

Fast Excel Import and Export

Jose Segarra

0.00/5 (No votes)

15 Oct 2014

Fast, really fast, import/export from Excel OOXML

Introduction

This tip presents code to access (read & write) data in an Excel sheet without using any external module, by directly accessing the data in the ooxml format. Also a great effort has been made to make it as fast as possible, minimizing the number of memory allocations, and making it as usable as possible.

You can find the most up to date code at https://github.com/jsegarra1971/SejExcelExport.

Background

There are different techniques to read/write data from an Excel sheet, some of them use the database paradigm (DAO, odbc,..), others use the Excel-like-objects path (worksheets, cells, ...), and others use Excel itself (COM, Interop,...). Each of these techniques has a number of libraries, hiding the dirty details, and making their use fairly simple.

These techniques have their pros and cons, but most of them either, require additional software (i.e.: Excel, Microsoft.ACE.OLEDB,...), usually this is not a big problem, as most of this software is open source, free or very cheap, and easily accessible.

Nevertheless and when it comes to do the job, most of these tools are quite "liberal" in their memory consumption. This usually it is not a problem for small sheets (less than 5,000 rows), but it can bring your system to its knees for big sheets (more than 50,000 rows).

Reading a Sheet

Reading a sheet is quite simple, you just need:

ExcelReader myReader=new ExcelReader("this_is_my_excel_file.xlsx");    // Open the file
myReader.Process(OnExcelCell);                                         // Process it

and to get the actual values:

?void OnExcelCell(char Column, int RowNumber, string value)
{
 if (Column=='#') Console.Write("Row: "+RowNumber);
 else Console.Write("Column: "+Column+" Value: "+value);
}

Currently, only the first sheet and columns up to 'Z' are processed. Nevertheless it is quite simple to overcome this limitations (see ExcelReader.cs).

Writing a Sheet

For writing data into a sheet, we need two things:

The data, which will come as an implementation of IDataReader.
A template, which will come as an Excel file with placeholders for the actual data. These placeholders are just names beginning with and asterik "*". The values from the data source are retrieved by matching this placeholder with the FIELDNAME.

Using this template:

ROW1 is fixed, and will be present in the output file
ROW2 will hold the fields "name","surname" and "age" of the first record of data
ROW3 will hold the fields "name","surname" and "age" of the second record of data

and then repeat until EOF.

ExcelWriter t = new ExcelWriter("Template.xlsx");   // This is the template
MySampleData data = new MySampleData();             // This is the IDataReader
t.Export(data,"output_file.xlsx");                  // Use the template to create the data file

Performance

It is hard to read/write an Excel file faster. Included there is a sample that generates an Excel file with 100,000 rows, and then reads all the 100,000 rows. Both operations done in less than 4 seconds, running on a rather outdated system (Windows 7 virtual machine on a 2009 iMac)

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here