Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / productivity / Office / MS-Excel

Extract .NET Object from Excel WorkSheet

0.00/5 (No votes)
22 Mar 2018CPOL4 min read 16.5K   280  
Describes a generic approach which extracts object list of a given type from spreadsheet

Introduction

This article demonstrates a quick way to get .NET object list rather than .NET Dataset/Datatable from Excel worksheet or Xml SpreadSheet.This approach handdle Enum type property of object and the situation where workSheet data not start from 1st row 1st column. Sometimes, users need a list of strong type objects in their application from Excel worksheet files and this article will help to do that in a simple and quick way.

Background

Few months ago, I got some Excel files(Save as Xml SpreadSheet 2003 format) to extract data from them into .NET object list. Data extraction was based on worksheet's header row or data column position/index. In my case, to serialize different objects from different XML need to code each time. To get rid of that problem, the following generic approach will help us.

Requirements: To run demo code :

  • .net framework 4.5 or above
  •  visual studio 2015 or above is needed.
  • Tested on (Microsoft Excel 2007) *Not mendatory

Using the Code

For example, let's have a Class named Person:

C#
public class Person {
    public int ID { get; set; }
    public string Name { get; set; }
    public Nullable<System.DateTime> DateOfBirth { get; set; }
    public string District {get; set;}
}

And Excel file(save as MS XML format 2003) look like this:

ExcelFile.xml

XML
 <?xml version="1.0"?>

<Workbook >
 <Worksheet  Name="Sheet1">
  <Table>
   <Row>
    <Cell><Data  Type="String">ID</Data></Cell>
    <Cell><Data  Type="String">Name</Data></Cell>
    <Cell><Data  Type="String">DofB</Data></Cell>
    <Cell></Cell>
   </Row>
   <Row>
    <Cell><Data  Type="Number">1</Data></Cell>
    <Cell><Data  Type="String">A</Data></Cell>
    <Cell><Data  Type="DateTime">2010-09-28T00:00:00.000</Data></Cell>
    <Cell><Data  Type="String">District Z</Data></Cell>
   </Row>
   <Row>
    <Cell><Data  Type="Number">2</Data></Cell>
    <Cell><Data  Type="String">B</Data></Cell>
    <Cell><Data  Type="DateTime">2011-09-28T00:00:00.000</Data></Cell>
    <Cell><Data  Type="String">District Y</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>

From the input XML file, we see some row of data. 1st row (index = 0) presents the header row and rest of them are Person's data row. To get Person List from the XML file:

Step 1

Add downloaded ExcelToObjectConvertor project  to your solution or add ExcelToObjectConvertor.dll to your project . Then include its namespace in the file where you want to use it :

C#
using ExcelToObjectConvertor;

Call ExcelXMLConvertor.GetWortSheetsInfo method. This will give the all workSheet List. This list contain work Sheet names, workSheet Id's and other information.

C#
string filePath = @"your excel file path"; 
List<WorkSheetInfo> workSheetList = ExcelXMLConvertor.GetWortSheetsInfo(filePath); 
or 
//for the file upload from fileUpload or OpenFileDialog
List<WorkSheetInfo> workSheetList = ExcelXMLConvertor.GetWortSheetsInfo(PostedFile.InputStream); 

From workSheetList find a worksheet by name or index or loop through all of them to get Data from them.

C#
// Get data by workSheet name
WorkSheetInfo workSheet = workSheetList.Find(wh => wh.WorkSheetName == @"Sheet1");
//Update properties or Map object property to Column Name/Number of workSheet. 
//get data from workSheet.

Or

// Get data from all workSheet
foreach (WorkSheetInfo workSheet in workSheetList)
{
    //Update properties or Map object property to Column Name/Number of workSheet.
    //get data from workSheet.
}              

Step 2

Default header row index is zero (HeaderRowIndex=0). You can change Header Row Index by calling :

C#
ExcelXMLConvertor.UpdateHeaderRowIndexProperty(workSheet.WorkSheetId, newheaderRowIndex);

By default, it will map Header text from Header row to object property name. For example, Person.Name map to 1st <Row> 2nd <cell> and Person.ID map to 1st <Row> 1st <cell>:

XML
<Row>
  <Cell><Data Type="String">ID</Data></Cell>    //-->1st cell , cell position or column index = 0 
  <cell><Data  Type="String">Name</Data></cell> //-->2nd cell , cell position or column index = 1 
  ...
</Row>

All 1st cell data will map to Person.ID and 2nd cell data will map to Person.Name

Special Case

In our example, Person.DateOfBirth can't map to 3rd <Cell> of <Row> nodes of XML file:

XML
<Row>
  ... 
  <cell><Data  Type="String">DofB</Data></cell> //-->3rd cell , cell position or column index = 2 
  ...
</Row>

To map Person.DateOfBirth to 3rd <cell> node with header text DofB:

C#
Person person = new Person();
ExcelXMLConvertor.MapPropertyNameToHeaderName(workSheet.WorkSheetId, 
                                             nameof(person.DateOfBirth), @"DofB");

The last <cell> node of all <row>s have no header text, so it can't map to any class property. But you can map this <cell> node with cell position or column index, so that all last <cell> nodes will map to Person.District. To do that, map last <cell> (index = 3) to Person.District :

C#
uint index = 3; 
ExcelXMLConvertor.MapPropertyNameToColumnNumber(workSheet.WorkSheetId, nameof(person.District),index);

*If there is no header row then you have to map all property with column number.

Step 3

Finally, call method ExcelXMLConvertor.GetObjectList<T>(WorkSheetId)

C#
// Get data by workSheet name
List<Person> personList = ExcelXMLConvertor.GetObjectList<Person>(workSheet.WorkSheetId);

Or

// Get data from all workSheet if all workSheets have same type's data.
List<Person> personList = new List<Person>();;
foreach (WorkSheetInfo workSheet in workSheetList)
{
    personList.AddRange(ExcelXMLConvertor.GetObjectList<Person>(workSheet.WorkSheetId));
} 

After execution of the code, PersonList will have two person objects:

C#
PersonList --> {Person(1, A, 2010-09-28, District Z), Person(2, B, 2011-09-28, District Y)}  

Full code look like below:

C#
using System;
using System.Xml;
using ExcelToObjectConvertor;
using System.Collections.Generic;
using System.IO;
using Excel = Microsoft.Office.Interop.Excel;

class Program
{
  static void Main(string[] args)
  {
     string filePath = Path.GetFullPath(@"Your File path");
     
     List<WorkSheetInfo> workSheetList= ExcelXMLConvertor.GetWortSheetsInfo(filePath);
     
     WorkSheetInfo workSheet = workSheetList.Find(wh => wh.WorkSheetName == @"Sheet1");
     Person person = new Person();
     ExcelXMLConvertor.MapPropertyNameToHeaderName(workSheet.WorkSheetId, 
                                         nameof(person.DateOfBirth), @"DofB");
     uint index = 3; 
     ExcelXMLConvertor.MapPropertyNameToColumnNumber(workSheet.WorkSheetId,
                                         nameof(person.District),index);
     //Result:
     List<Person> personList = ExcelXMLConvertor.GetObjectList<Person>(workSheet.WorkSheetId);
  }
}

Using the sample Web application

In my demo project I have gived a sample Web(aspx) and Console project to give an idea how ExcelToObjectConvertor works. Following images will show and explain how it works. Run the demo Web project and you will find page look like below image. 

Browse your excel file then upload it by click Upload Excel File button. If you do not need any property mapping then just select a work sheet name from Work sheet DropDownList, select the Object name from Object DropDownList (Which are not showing in the above image) which you want to convert and then click Get Object List button. You will get the converted object list is showing at bottom of the page.

Header Row

By default 1st row detect as header row but user can select a row from first 5 row of the spreadsheet as header row from First5Row grid. If there is no header row then unchecked the HasHeader check box and map all property you want to loaded with data by column number. To do that, select Column Number RadioButton. See the below image:

For Map Property To Header Name or Cell Index

Select a property from Property DropDownList, select a header name from Header Name DropDownList then click AddToMappingPair button. One MappingPair will add to Mapping Pair list. You can also map property with column number/index by select Column Number RadioButton, insert a index(0 to 26), select appropiate property from Property DropDownList, click AddToMappingPair button.

 

After Map all property click Get Object List button. See the sample output below:

Hope it will help some one  and minimize work load.

History

  • 22nd March, 2018: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)