Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / productivity / Office / MS-Excel

Read Excel using NPOI

4.59/5 (9 votes)
6 Nov 2015CPOL1 min read 52.3K  
Get Excel cell values as string

Introduction

The C# code snippet demonstrates how to use the open-source NPOI library to read cell values in Microsoft Excel files.

References: NPOI - https://npoi.codeplex.com/

Background

Excel cell can contain values of different types. Also, values may be evaluated from formulae. We need to cater to these cases when reading the cell values.

Using the Code

I write an abstract class ExcelFileReader providing basic functions to read values from an Excel file.

C#
//
// ExcelFileReader
//
public abstract class ExcelFileReader
{
    protected IWorkbook workbook;
    protected DataFormatter dataFormatter;
    protected IFormulaEvaluator formulaEvaluator;

    //
    // Initialize from a stream of Excel file
    //
    protected void InitializeMembers(Stream excelFileStream)
    {
        this.workbook = WorkbookFactory.Create(excelFileStream);
        if (this.workbook != null)
        {
            this.dataFormatter = new DataFormatter(CultureInfo.InvariantCulture);
            this.formulaEvaluator = WorkbookFactory.CreateFormulaEvaluator(this.workbook);
        }
    }

    //
    // Get formatted value as string from the specified cell
    //
    protected string GetFormattedValue(ICell cell)
    {
        string returnValue = string.Empty;
        if (cell != null)
        {
            try
            {
                // Get evaluated and formatted cell value
                returnValue = this.dataFormatter.FormatCellValue(cell, this.formulaEvaluator);
            }
            catch
            {
                // When failed in evaluating the formula, use stored values instead...
                // and set cell value for reference from formulae in other cells...
                if (cell.CellType == CellType.FORMULA)
                {
                    switch (cell.CachedFormulaResultType)
                    {
                        case CellType.STRING:
                            returnValue = cell.StringCellValue;
                            cell.SetCellValue(cell.StringCellValue);
                            break;
                        case CellType.NUMERIC:
                            returnValue = dataFormatter.FormatRawCellContents
                            (cell.NumericCellValue, 0,  cell.CellStyle.GetDataFormatString())
                            cell.SetCellValue(cell.NumericCellValue);
                            break;
                        case CellType.BOOLEAN:
                            returnValue = cell.BooleanCellValue.ToString();
                            cell.SetCellValue(cell.BooleanCellValue);
                            break;
                        default:
                            break;
                    }
                }
            }
        }

        return (returnValue ?? string.Empty).Trim();
    }

    //
    // Get unformatted value as string from the specified cell
    //
    protected string GetUnformattedValue(ICell cell)
    {
        string returnValue = string.Empty;
        if (cell != null)
        {
            try
            {
                // Get evaluated cell value
                returnValue = (cell.CellType == CellType.NUMERIC || 
                (cell.CellType == CellType.FORMULA && 
                cell.CachedFormulaResultType == CellType.NUMERIC)) ?
                    formulaEvaluator.EvaluateInCell(cell).NumericCellValue.ToString() :
                    this.dataFormatter.FormatCellValue(cell, this.formulaEvaluator);
            }
            catch
            {
                // When failed in evaluating the formula, use stored values instead...
                // and set cell value for reference from formulae in other cells...
                if (cell.CellType == CellType.FORMULA)
                {
                    switch (cell.CachedFormulaResultType)
                    {
                        case CellType.STRING:
                            returnValue = cell.StringCellValue;
                            cell.SetCellValue(cell.StringCellValue);
                            break;
                        case CellType.NUMERIC:
                            returnValue = cell.NumericCellValue.ToString();
                            cell.SetCellValue(cell.NumericCellValue);
                            break;
                        case CellType.BOOLEAN:
                            returnValue = cell.BooleanCellValue.ToString();
                            cell.SetCellValue(cell.BooleanCellValue);
                            break;
                        default:
                            break;
                    }
                }
            }
        }

        return (returnValue ?? string.Empty).Trim();
    }
}

You have to implement this abstract ExcelFileReader, which gets the ISheet object from this.workbook and manipulate its IRow object to get the cell values.

Points of Interest

The try-catch block in the function GetFormattedValue() handles the failure in evaluating the cell formula (says with external reference) by taking the cached value (if any) instead. The cell value is then set to the cached value by calling the SetCellValue() function for any formulae in other cells which reference to this cell. For numeric value, formatted value will be obtained by calling DataFormatter.FormatRawCellContents() function.

Similarily, the function GetUnformattedValue() gets cell value as string without any format applied. It identities numeric cell, which format may be applied, and gets the unformatted value from the ICell.NumericCellValue property.

The CellType enum does not have any item like DATE for date/time value, which are stored as number in Excel. So, there is no direct way to identity the cell is storing a date/time value. I think one way is to see its format by refering to ICell.CellStyle.DataFormat property or ICell.CellStyle.GetDataFormatString() function.

History

  • 6th November, 2015 - First publication

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)