Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / productivity / Office / MS-Excel

Export very large data to Excel file

4.87/5 (15 votes)
27 Sep 2013CPOL3 min read 303.4K   8.7K  
This code solves the problem when exporting very large data size to Excel file.

Introduction

This code solves the popular problem when creating a large Excel file with massive amounts of rows.

We faced this problem (and still are) when we tried to export data from a database and the data size was very large. Using any common third party library such as NOPI (http://npoi.codeplex.com), EPPlus (http://epplus.codeplex.com), or even using OpenXML (http://www.codeproject.com/Articles/371203/Creating-basic-Excel-workbook-with-Open-XML), we got stuck at OutofMemoryException exception.

Background

OpenXML is a standard and an open format for Office files. It defines the document structure and the elements for Office documents like .xlsx for Excel. I used here the Northwind database, but you can use your own database. You don't have to install Microsoft Office to create an Excel file with the OpenXML SDK.

Using the code

To be able to run this code, you need to install the OpenXML SDK 2.0 from Download Center. You need also to reference the WindowsBase assembly version 3.0 or 4.0 in this project (WindowsBase.dll). Please note that the WindowsBase assembly version 4.0 comes with .NET framework 4.5.

I developed this console application using Visual Studio 2012.

If you look at the Program class, you will find a method called ExportToOxml which is responsible for the creation of the Excel file.

The cause of the problem is we create the object that represents the Excel file in memory and we start to add cell data to it. Your application's consumed memory will start to increase and increase until it raises the OutOfMemoryException exception.

You will find out that I solved this problem by dividing the data that is coming from the database into groups of rows (it is configurable), and each group will be written to an individual sheet in the Excel workbook. I write the whole Excel object to the disk after adding each sheet to it and free the memory of the sheet data, then reopen the same Excel file again (without loading its data in memory), and add the next group of data in a new sheet, save it to the disk, and the process continues until all the data we have are written to the Excel file.

Please use the variable rowsPerSheet to control the number of rows per sheet that you want to produce. For example, I use 150000 value when I exported 700000 rows. if you are going to export more than Excel file at the same time (as in web sites/services), then choose smaller value of rowsPerSheet.

C#
rowsPerSheet  = 50000; 

You will notice that I did not load my data into the DataTable, as this will consume the application memory, but I used a DataReader which is fantastic in this case.

The project is for demonstration purposes only.

Please read my code comments inside the code, as it explains more about the logic and execution.

Do you have in your mind the question of the cell styles (font, colors, etc.)? I will update the article very soon with it.

C#
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Data;
using System.Data.SqlClient;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
 
namespace ExportToExcel
{
    class Program
    {
        static private int rowsPerSheet = 100;
        static private DataTable ResultsData=new DataTable();
        static void Main(string[] args)
        {
            const string queryString = "SELECT OrderID, CustomerID FROM dbo.Orders;";
            using (var connection =
                   new SqlConnection(@"Data Source=localhost\SQL2008;" + 
                       @"Initial Catalog=Northwind;Integrated Security=SSPI"))
                {
                    var command =new SqlCommand(queryString, connection);
                    connection.Open();

                    SqlDataReader reader = command.ExecuteReader();
                    
                    int c = 0;
                    bool firstTime = true;

                    //Get the Columns names, types, this will help
                    //when we need to format the cells in the excel sheet.
                    DataTable dtSchema = reader.GetSchemaTable();
                    var listCols = new List<DataColumn>();
                    if (dtSchema != null)
                    {
                        foreach (DataRow drow in dtSchema.Rows)
                        {
                            string columnName = Convert.ToString(drow["ColumnName"]);
                            var column = new DataColumn(columnName, (Type)(drow["DataType"]));
                            column.Unique = (bool)drow["IsUnique"];
                            column.AllowDBNull = (bool)drow["AllowDBNull"];
                            column.AutoIncrement = (bool)drow["IsAutoIncrement"];
                            listCols.Add(column);
                            ResultsData.Columns.Add(column);
                        }
                    }

                    // Call Read before accessing data. 
                    while (reader.Read())
                    {
                        DataRow dataRow = ResultsData.NewRow();
                        for (int i = 0; i < listCols.Count; i++)
                        {
                            dataRow[(listCols[i])] = reader[i];
                        }
                        ResultsData.Rows.Add(dataRow);
                        c++;
                        if (c == rowsPerSheet)
                        {
                            c = 0;
                            ExportToOxml(firstTime);
                            ResultsData.Clear();
                            firstTime = false;
                        }
                    }
                    if (ResultsData.Rows.Count > 0)
                    {
                        ExportToOxml(firstTime);
                        ResultsData.Clear();
                    }
                    // Call Close when done reading.
                    reader.Close();
                }
        }
 
        private static void ExportToOxml(bool firstTime)
        {
            const string fileName = @"C:\MyExcel.xlsx";
 
            //Delete the file if it exists. 
            if (firstTime && File.Exists(fileName))
            {
                File.Delete(fileName);
            }
 
            uint sheetId = 1; //Start at the first sheet in the Excel workbook.
            
            if (firstTime)
            {
                //This is the first time of creating the excel file and the first sheet.
                // Create a spreadsheet document by supplying the filepath.
                // By default, AutoSave = true, Editable = true, and Type = xlsx.
                SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.
                    Create(fileName, SpreadsheetDocumentType.Workbook);
 
                // Add a WorkbookPart to the document.
                WorkbookPart workbookpart = spreadsheetDocument.AddWorkbookPart();
                workbookpart.Workbook = new Workbook();
 
                // Add a WorksheetPart to the WorkbookPart.
                var worksheetPart = workbookpart.AddNewPart<WorksheetPart>();
                var sheetData = new SheetData();
                worksheetPart.Worksheet = new Worksheet(sheetData);
 

                var bold1 = new Bold();
                CellFormat cf = new CellFormat();
 

                // Add Sheets to the Workbook.
                Sheets sheets;
                sheets = spreadsheetDocument.WorkbookPart.Workbook.
                    AppendChild<Sheets>(new Sheets());
 
                // Append a new worksheet and associate it with the workbook.
                var sheet = new Sheet()
                {
                    Id = spreadsheetDocument.WorkbookPart.
                        GetIdOfPart(worksheetPart),
                    SheetId = sheetId,
                    Name = "Sheet" + sheetId
                };
                sheets.Append(sheet);
 
                //Add Header Row.
                var headerRow = new Row();
                foreach (DataColumn column in ResultsData.Columns)
                {
                    var cell = new Cell { DataType = CellValues.String, 
                                     CellValue = new CellValue(column.ColumnName) };
                    headerRow.AppendChild(cell);
                }
                sheetData.AppendChild(headerRow);
 
                foreach (DataRow row in ResultsData.Rows)
                {
                    var newRow = new Row();
                    foreach (DataColumn col in ResultsData.Columns)
                    {
                        var cell = new Cell
                        {
                            DataType = CellValues.String,
                            CellValue = new CellValue(row[col].ToString())
                        };
                        newRow.AppendChild(cell);
                    }
 
                    sheetData.AppendChild(newRow);
                }
                workbookpart.Workbook.Save();
 
                spreadsheetDocument.Close();
            }
            else
            {
                // Open the Excel file that we created before, and start to add sheets to it.
                var spreadsheetDocument = SpreadsheetDocument.Open(fileName, true);
 
                var workbookpart = spreadsheetDocument.WorkbookPart;
                if (workbookpart.Workbook == null)
                    workbookpart.Workbook = new Workbook();
 
                var worksheetPart = workbookpart.AddNewPart<WorksheetPart>();
                var sheetData = new SheetData();
                worksheetPart.Worksheet = new Worksheet(sheetData);
                var sheets = spreadsheetDocument.WorkbookPart.Workbook.Sheets;
 
                if (sheets.Elements<Sheet>().Any())
                {
                    //Set the new sheet id
                    sheetId = sheets.Elements<Sheet>().Max(s => s.SheetId.Value) + 1;
                }
                else
                {
                    sheetId = 1;
                }
 
                // Append a new worksheet and associate it with the workbook.
                var sheet = new Sheet()
                {
                    Id = spreadsheetDocument.WorkbookPart.
                        GetIdOfPart(worksheetPart),
                    SheetId = sheetId,
                    Name = "Sheet" + sheetId
                };
                sheets.Append(sheet);
 
                //Add the header row here.
                var headerRow = new Row();
 
                foreach (DataColumn column in ResultsData.Columns)
                {
                    var cell = new Cell { DataType = CellValues.String, 
                            CellValue = new CellValue(column.ColumnName) };
                    headerRow.AppendChild(cell);
                }
                sheetData.AppendChild(headerRow);
                
                foreach (DataRow row in ResultsData.Rows)
                {
                    var newRow = new Row();
 
                    foreach (DataColumn col in ResultsData.Columns)
                    {
                        var cell = new Cell
                        {
                            DataType = CellValues.String,
                            CellValue = new CellValue(row[col].ToString())
                        };
                        newRow.AppendChild(cell);
                    }
 
                    sheetData.AppendChild(newRow);
                }
 
                workbookpart.Workbook.Save();
 
                // Close the document.
                spreadsheetDocument.Close();
            }
        }
    }
}

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)