Question

I found NPOI is very good to write Excel files with C#.

But I want to open, read and modify Excel files in C#.

How can I do this?

Was it helpful?

Solution

Simple read example below:

using NPOI.HSSF.UserModel;
using NPOI.SS.UserModel;

//.....

private void button1_Click(object sender, EventArgs e)
{
    HSSFWorkbook hssfwb;
    using (FileStream file = new FileStream(@"c:\test.xls", FileMode.Open, FileAccess.Read))
    {
        hssfwb= new HSSFWorkbook(file);
    }

    ISheet sheet = hssfwb.GetSheet("Arkusz1");
    for (int row = 0; row <= sheet.LastRowNum; row++)
    {
        if (sheet.GetRow(row) != null) //null is when the row only contains empty cells 
        {
            MessageBox.Show(string.Format("Row {0} = {1}", row, sheet.GetRow(row).GetCell(0).StringCellValue));
        }
    }
}  

By the way: on NPOI website here in Download section there is example package - a pack of C# examples. Try it, if you haven't yet. :)

OTHER TIPS

It might be helpful to rely on the Workbook factory to instantiate the workbook object since the factory method will do the detection of xls or xlsx for you. Reference: http://apache-poi.1045710.n5.nabble.com/How-to-check-for-valid-excel-files-using-POI-without-checking-the-file-extension-td2341055.html

IWorkbook workbook = WorkbookFactory.Create(inputStream);

If you're not sure of the Sheet's name but you are sure of the index (0 based), you can grab the sheet like this:

ISheet sheet = workbook.GetSheetAt(sheetIndex);

You can then iterate through the rows using code supplied by the accepted answer from mj82

I find NPOI very usefull for working with Excel Files, here is my implementation (Comments are in Spanish, sorry for that):

This Method Opens an Excel (both xls or xlsx) file and converts it into a DataTable.

        /// <summary>Abre un archivo de Excel (xls o xlsx) y lo convierte en un DataTable.
    /// LA PRIMERA FILA DEBE CONTENER LOS NOMBRES DE LOS CAMPOS.</summary>
    /// <param name="pRutaArchivo">Ruta completa del archivo a abrir.</param>
    /// <param name="pHojaIndex">Número (basado en cero) de la hoja que se desea abrir. 0 es la primera hoja.</param>
    private DataTable Excel_To_DataTable(string pRutaArchivo, int pHojaIndex)
    {
        // --------------------------------- //
        /* REFERENCIAS:
         * NPOI.dll
         * NPOI.OOXML.dll
         * NPOI.OpenXml4Net.dll */
        // --------------------------------- //
        /* USING:
         * using NPOI.SS.UserModel;
         * using NPOI.HSSF.UserModel;
         * using NPOI.XSSF.UserModel; */
        // AUTOR: Ing. Jhollman Chacon R. 2015
        // --------------------------------- //
        DataTable Tabla = null;
        try
        {
            if (System.IO.File.Exists(pRutaArchivo))
            {

                IWorkbook workbook = null;  //IWorkbook determina si es xls o xlsx              
                ISheet worksheet = null;
                string first_sheet_name = "";

                using (FileStream FS = new FileStream(pRutaArchivo, FileMode.Open, FileAccess.Read))
                {
                    workbook = WorkbookFactory.Create(FS);          //Abre tanto XLS como XLSX
                    worksheet = workbook.GetSheetAt(pHojaIndex);    //Obtener Hoja por indice
                    first_sheet_name = worksheet.SheetName;         //Obtener el nombre de la Hoja

                    Tabla = new DataTable(first_sheet_name);
                    Tabla.Rows.Clear();
                    Tabla.Columns.Clear();

                    // Leer Fila por fila desde la primera
                    for (int rowIndex = 0; rowIndex <= worksheet.LastRowNum; rowIndex++)
                    {
                        DataRow NewReg = null;
                        IRow row = worksheet.GetRow(rowIndex);
                        IRow row2 = null;
                        IRow row3 = null;

                        if (rowIndex == 0)
                        {
                            row2 = worksheet.GetRow(rowIndex + 1); //Si es la Primera fila, obtengo tambien la segunda para saber el tipo de datos
                            row3 = worksheet.GetRow(rowIndex + 2); //Y la tercera tambien por las dudas
                        }

                        if (row != null) //null is when the row only contains empty cells 
                        {
                            if (rowIndex > 0) NewReg = Tabla.NewRow();

                            int colIndex = 0;
                            //Leer cada Columna de la fila
                            foreach (ICell cell in row.Cells)
                            {
                                object valorCell = null;
                                string cellType = "";
                                string[] cellType2 = new string[2];

                                if (rowIndex == 0) //Asumo que la primera fila contiene los titlos:
                                {
                                    for (int i = 0; i < 2; i++)
                                    {
                                        ICell cell2 = null;
                                        if (i == 0) { cell2 = row2.GetCell(cell.ColumnIndex); }
                                        else { cell2 = row3.GetCell(cell.ColumnIndex); }

                                        if (cell2 != null)
                                        {
                                            switch (cell2.CellType)
                                            {
                                                case CellType.Blank: break;
                                                case CellType.Boolean: cellType2[i] = "System.Boolean"; break;
                                                case CellType.String: cellType2[i] = "System.String"; break;
                                                case CellType.Numeric:
                                                    if (HSSFDateUtil.IsCellDateFormatted(cell2)) { cellType2[i] = "System.DateTime"; }
                                                    else
                                                    {
                                                        cellType2[i] = "System.Double";  //valorCell = cell2.NumericCellValue;
                                                    }
                                                    break;

                                                case CellType.Formula:                                                      
                                                    bool continuar = true;
                                                    switch (cell2.CachedFormulaResultType)
                                                    {
                                                        case CellType.Boolean: cellType2[i] = "System.Boolean"; break;
                                                        case CellType.String: cellType2[i] = "System.String"; break;
                                                        case CellType.Numeric:
                                                            if (HSSFDateUtil.IsCellDateFormatted(cell2)) { cellType2[i] = "System.DateTime"; }
                                                            else
                                                            {
                                                                try
                                                                {
                                                                    //DETERMINAR SI ES BOOLEANO
                                                                    if (cell2.CellFormula == "TRUE()") { cellType2[i] = "System.Boolean"; continuar = false; }
                                                                    if (continuar && cell2.CellFormula == "FALSE()") { cellType2[i] = "System.Boolean"; continuar = false; }
                                                                    if (continuar) { cellType2[i] = "System.Double"; continuar = false; }
                                                                }
                                                                catch { }
                                                            } break;
                                                    }
                                                    break;
                                                default:
                                                    cellType2[i] = "System.String"; break;
                                            }
                                        }
                                    }

                                    //Resolver las diferencias de Tipos
                                    if (cellType2[0] == cellType2[1]) { cellType = cellType2[0]; }
                                    else
                                    {
                                        if (cellType2[0] == null) cellType = cellType2[1];
                                        if (cellType2[1] == null) cellType = cellType2[0];
                                        if (cellType == "") cellType = "System.String";
                                    }

                                    //Obtener el nombre de la Columna
                                    string colName = "Column_{0}";
                                    try { colName = cell.StringCellValue; }
                                    catch { colName = string.Format(colName, colIndex); }

                                    //Verificar que NO se repita el Nombre de la Columna
                                    foreach (DataColumn col in Tabla.Columns)
                                    {
                                        if (col.ColumnName == colName) colName = string.Format("{0}_{1}", colName, colIndex);
                                    }

                                    //Agregar el campos de la tabla:
                                    DataColumn codigo = new DataColumn(colName, System.Type.GetType(cellType));
                                    Tabla.Columns.Add(codigo); colIndex++;
                                }
                                else
                                {
                                    //Las demas filas son registros:
                                    switch (cell.CellType)
                                    {
                                        case CellType.Blank: valorCell = DBNull.Value; break;
                                        case CellType.Boolean: valorCell = cell.BooleanCellValue; break;
                                        case CellType.String: valorCell = cell.StringCellValue; break;
                                        case CellType.Numeric:
                                            if (HSSFDateUtil.IsCellDateFormatted(cell)) { valorCell = cell.DateCellValue; }
                                            else { valorCell = cell.NumericCellValue; } break;
                                        case CellType.Formula:
                                            switch (cell.CachedFormulaResultType)
                                            {
                                                case CellType.Blank: valorCell = DBNull.Value; break;
                                                case CellType.String: valorCell = cell.StringCellValue; break;
                                                case CellType.Boolean: valorCell = cell.BooleanCellValue; break;
                                                case CellType.Numeric:
                                                    if (HSSFDateUtil.IsCellDateFormatted(cell)) { valorCell = cell.DateCellValue; }
                                                    else { valorCell = cell.NumericCellValue; }
                                                    break;
                                            }
                                            break;
                                        default: valorCell = cell.StringCellValue; break;
                                    }
                                    //Agregar el nuevo Registro
                                    if (cell.ColumnIndex <= Tabla.Columns.Count - 1) NewReg[cell.ColumnIndex] = valorCell;
                                }
                            }
                        }
                        if (rowIndex > 0) Tabla.Rows.Add(NewReg);
                    }
                    Tabla.AcceptChanges();
                }
            }
            else
            {
                throw new Exception("ERROR 404: El archivo especificado NO existe.");
            }
        }
        catch (Exception ex)
        {
            throw ex;
        }
        return Tabla;
    }

This Second method does the oposite, saves a DataTable into an Excel File, yeah it can either be xls or the new xlsx, your choise!

        /// <summary>Convierte un DataTable en un archivo de Excel (xls o Xlsx) y lo guarda en disco.</summary>
    /// <param name="pDatos">Datos de la Tabla a guardar. Usa el nombre de la tabla como nombre de la Hoja</param>
    /// <param name="pFilePath">Ruta del archivo donde se guarda.</param>
    private void DataTable_To_Excel(DataTable pDatos, string pFilePath)
    {
        try
        {
            if (pDatos != null && pDatos.Rows.Count > 0)
            {
                IWorkbook workbook = null;
                ISheet worksheet = null;

                using (FileStream stream = new FileStream(pFilePath, FileMode.Create, FileAccess.ReadWrite))
                {
                    string Ext = System.IO.Path.GetExtension(pFilePath); //<-Extension del archivo
                    switch (Ext.ToLower())
                    {
                        case ".xls":
                            HSSFWorkbook workbookH = new HSSFWorkbook();
                            NPOI.HPSF.DocumentSummaryInformation dsi = NPOI.HPSF.PropertySetFactory.CreateDocumentSummaryInformation();
                            dsi.Company = "Cutcsa"; dsi.Manager = "Departamento Informatico";
                            workbookH.DocumentSummaryInformation = dsi;
                            workbook = workbookH;
                            break;

                        case ".xlsx": workbook = new XSSFWorkbook(); break;
                    }

                    worksheet = workbook.CreateSheet(pDatos.TableName); //<-Usa el nombre de la tabla como nombre de la Hoja

                    //CREAR EN LA PRIMERA FILA LOS TITULOS DE LAS COLUMNAS
                    int iRow = 0;
                    if (pDatos.Columns.Count > 0)
                    {
                        int iCol = 0;
                        IRow fila = worksheet.CreateRow(iRow);
                        foreach (DataColumn columna in pDatos.Columns)
                        {
                            ICell cell = fila.CreateCell(iCol, CellType.String);
                            cell.SetCellValue(columna.ColumnName);
                            iCol++;
                        }
                        iRow++;
                    }

                    //FORMATOS PARA CIERTOS TIPOS DE DATOS
                    ICellStyle _doubleCellStyle = workbook.CreateCellStyle();
                    _doubleCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("#,##0.###");

                    ICellStyle _intCellStyle = workbook.CreateCellStyle();
                    _intCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("#,##0");

                    ICellStyle _boolCellStyle = workbook.CreateCellStyle();
                    _boolCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("BOOLEAN");

                    ICellStyle _dateCellStyle = workbook.CreateCellStyle();
                    _dateCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("dd-MM-yyyy");

                    ICellStyle _dateTimeCellStyle = workbook.CreateCellStyle();
                    _dateTimeCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("dd-MM-yyyy HH:mm:ss");

                    //AHORA CREAR UNA FILA POR CADA REGISTRO DE LA TABLA
                    foreach (DataRow row in pDatos.Rows)
                    {
                        IRow fila = worksheet.CreateRow(iRow);
                        int iCol = 0;
                        foreach (DataColumn column in pDatos.Columns)
                        {
                            ICell cell = null; //<-Representa la celda actual                               
                            object cellValue = row[iCol]; //<- El valor actual de la celda

                            switch (column.DataType.ToString())
                            {
                                case "System.Boolean":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.Boolean);

                                        if (Convert.ToBoolean(cellValue)) { cell.SetCellFormula("TRUE()"); }
                                        else { cell.SetCellFormula("FALSE()"); }

                                        cell.CellStyle = _boolCellStyle;
                                    }
                                    break;

                                case "System.String":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.String);
                                        cell.SetCellValue(Convert.ToString(cellValue));
                                    }
                                    break;

                                case "System.Int32":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.Numeric);
                                        cell.SetCellValue(Convert.ToInt32(cellValue));
                                        cell.CellStyle = _intCellStyle;
                                    }
                                    break;
                                case "System.Int64":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.Numeric);
                                        cell.SetCellValue(Convert.ToInt64(cellValue));
                                        cell.CellStyle = _intCellStyle;
                                    }
                                    break;
                                case "System.Decimal":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.Numeric);
                                        cell.SetCellValue(Convert.ToDouble(cellValue));
                                        cell.CellStyle = _doubleCellStyle;
                                    }
                                    break;
                                case "System.Double":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.Numeric);
                                        cell.SetCellValue(Convert.ToDouble(cellValue));
                                        cell.CellStyle = _doubleCellStyle;
                                    }
                                    break;

                                case "System.DateTime":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.Numeric);
                                        cell.SetCellValue(Convert.ToDateTime(cellValue));

                                        //Si No tiene valor de Hora, usar formato dd-MM-yyyy
                                        DateTime cDate = Convert.ToDateTime(cellValue);
                                        if (cDate != null && cDate.Hour > 0) { cell.CellStyle = _dateTimeCellStyle; }
                                        else { cell.CellStyle = _dateCellStyle; }
                                    }
                                    break;
                                default:
                                    break;
                            }
                            iCol++;
                        }
                        iRow++;
                    }

                    workbook.Write(stream);
                    stream.Close();
                }
            }
        }
        catch (Exception ex)
        {
            throw ex;
        }
    }

With this 2 methods you can Open an Excel file, load it into a DataTable, do your modifications and save it back into an Excel file.

Hope you guys find this usefull.

private DataTable GetDataTableFromExcel(String Path)
    {
        XSSFWorkbook wb;
        XSSFSheet sh;
        String Sheet_name;

        using (var fs = new FileStream(Path, FileMode.Open, FileAccess.Read))
        {
            wb = new XSSFWorkbook(fs);

               Sheet_name= wb.GetSheetAt(0).SheetName;  //get first sheet name
        }
        DataTable DT = new DataTable();
        DT.Rows.Clear();
        DT.Columns.Clear();

        // get sheet
        sh = (XSSFSheet)wb.GetSheet(Sheet_name);

        int i = 0;
        while (sh.GetRow(i) != null)
        {
            // add neccessary columns
            if (DT.Columns.Count < sh.GetRow(i).Cells.Count)
            {
                for (int j = 0; j < sh.GetRow(i).Cells.Count; j++)
                {
                    DT.Columns.Add("", typeof(string));
                }
            }

            // add row
            DT.Rows.Add();

            // write row value
            for (int j = 0; j < sh.GetRow(i).Cells.Count; j++)
            {
                var cell = sh.GetRow(i).GetCell(j);

                if (cell != null)
                {
                    // TODO: you can add more cell types capatibility, e. g. formula
                    switch (cell.CellType)
                    {
                        case NPOI.SS.UserModel.CellType.Numeric:
                            DT.Rows[i][j] = sh.GetRow(i).GetCell(j).NumericCellValue;
                            //dataGridView1[j, i].Value = sh.GetRow(i).GetCell(j).NumericCellValue;

                            break;
                        case NPOI.SS.UserModel.CellType.String:
                            DT.Rows[i][j] = sh.GetRow(i).GetCell(j).StringCellValue;

                            break;
                    }
                }
            }

            i++;
        }

        return DT;
    }
private static ISheet GetFileStream(string fullFilePath)
    {
        var fileExtension = Path.GetExtension(fullFilePath);
        string sheetName;
        ISheet sheet = null;
        switch (fileExtension)
        {
            case ".xlsx":
                using (var fs = new FileStream(fullFilePath, FileMode.Open, FileAccess.Read))
                {
                    var wb = new XSSFWorkbook(fs);
                    sheetName = wb.GetSheetAt(0).SheetName;
                    sheet = (XSSFSheet) wb.GetSheet(sheetName);
                }
                break;
            case ".xls":
                using (var fs = new FileStream(fullFilePath, FileMode.Open, FileAccess.Read))
                {
                    var wb = new HSSFWorkbook(fs);
                    sheetName = wb.GetSheetAt(0).SheetName;
                    sheet = (HSSFSheet) wb.GetSheet(sheetName);
                }
                break;
        }
        return sheet;
    }

    private static DataTable GetRequestsDataFromExcel(string fullFilePath)
    {
        try
        {
            var sh = GetFileStream(fullFilePath);
            var dtExcelTable = new DataTable();
            dtExcelTable.Rows.Clear();
            dtExcelTable.Columns.Clear();
            var headerRow = sh.GetRow(0);
            int colCount = headerRow.LastCellNum;
            for (var c = 0; c < colCount; c++)
                dtExcelTable.Columns.Add(headerRow.GetCell(c).ToString());
            var i = 1;
            var currentRow = sh.GetRow(i);
            while (currentRow != null)
            {
                var dr = dtExcelTable.NewRow();
                for (var j = 0; j < currentRow.Cells.Count; j++)
                {
                    var cell = currentRow.GetCell(j);

                    if (cell != null)
                        switch (cell.CellType)
                        {
                            case CellType.Numeric:
                                dr[j] = DateUtil.IsCellDateFormatted(cell)
                                    ? cell.DateCellValue.ToString(CultureInfo.InvariantCulture)
                                    : cell.NumericCellValue.ToString(CultureInfo.InvariantCulture);
                                break;
                            case CellType.String:
                                dr[j] = cell.StringCellValue;
                                break;
                            case CellType.Blank:
                                dr[j] = string.Empty;
                                break;
                        }
                }
                dtExcelTable.Rows.Add(dr);
                i++;
                currentRow = sh.GetRow(i);
            }
            return dtExcelTable;
        }
        catch (Exception e)
        {
            throw;
        }
    }

As Janoulle pointed out, you don't need to detect which extension it is if you use the WorkbookFactory, it will do it for you. I recently had to implement a solution using NPOI to read Excel files and import email addresses into a sql database. My main problem was that I was probably going to receive about 12 different Excel layouts from different customers so I needed something that could be changed quickly without much code. I ended up using Npoi.Mapper which is an awesome tool! Highly recommended!

Here is my complete solution:

using System.IO;
using System.Linq;
using Npoi.Mapper;
using Npoi.Mapper.Attributes;
using NPOI.SS.UserModel;

namespace JobCustomerImport.Processors
{
    public class ExcelEmailProcessor
    {
        private UserManagementServiceContext DataContext { get; }

        public ExcelEmailProcessor(int customerNumber)
        {
            DataContext = new UserManagementServiceContext();
        }

        public void Execute(string localPath, int sheetIndex)
        {
            IWorkbook workbook;
            using (FileStream file = new FileStream(localPath, FileMode.Open, FileAccess.Read))
            {
                workbook = WorkbookFactory.Create(file);
            }

            var importer = new Mapper(workbook);
            var items = importer.Take<MurphyExcelFormat>(sheetIndex);
            foreach(var item in items)
            {
                var row = item.Value;
                if (string.IsNullOrEmpty(row.EmailAddress))
                    continue;

                UpdateUser(row);
            }

            DataContext.SaveChanges();
        }

        private void UpdateUser(MurphyExcelFormat row)
        {
            //LOGIC HERE TO UPDATE A USER IN DATABASE...
        }

        private class MurphyExcelFormat
        {
            [Column("District")]
            public int District { get; set; }

            [Column("DM")]
            public string FullName { get; set; }

            [Column("Email Address")]
            public string EmailAddress { get; set; }

            [Column(3)]
            public string Username { get; set; }

            public string FirstName
            {
                get
                {
                    return Username.Split('.')[0];
                }
            }

            public string LastName
            {
                get
                {
                    return Username.Split('.')[1];
                }
            }
        }
    }
}

I am so happy with NPOI + Npoi.Mapper (from Donny Tian) as an Excel import solution that I wrote a blog post about it, going in to more detail about this code above. You can read it here if you wish: Easiest way to import excel files. The best thing about this solution is that it runs perfectly in a serverless azure/cloud environment which I couldn't get with other Excel tools/libraries.

Since you've asked to read and modify the xls file I have changed @mj82's answer to correspond your needs.

HSSFWorkbook does not have Save method, but it does have Write to a stream.

static void Main(string[] args)
        {
            string filepath = @"C:\test.xls";

            HSSFWorkbook hssfwb;

            using (FileStream file = new FileStream(filepath, FileMode.Open, FileAccess.Read))
            {
                hssfwb = new HSSFWorkbook(file);
            }

            ISheet sheet = hssfwb.GetSheetAt(0);

            for (int row = 0; row <= sheet.LastRowNum; row++)
            {
                if (sheet.GetRow(row) != null) //null is when the row only contains empty cells 
                {
                    // Set new cell value
                    sheet.GetRow(row).GetCell(0).SetCellValue("foo");
                    Console.WriteLine("Row {0} = {1}", row, sheet.GetRow(row).GetCell(0).StringCellValue);
                }
            }

            // Save the file
            using (FileStream file = new FileStream(filepath, FileMode.Open, FileAccess.Write))
            {
                hssfwb.Write(file);
            }

            Console.ReadLine();
        }
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top