Question

I am writing an application to open an Excel sheet and read it

MyApp = new Excel.Application();
MyBook = MyApp.Workbooks.Open(filename);
MySheet = (Excel.Worksheet)MyBook.Sheets[1]; // Explict cast is not required here
lastRow = MySheet.Cells.SpecialCells(Excel.XlCellType.xlCellTypeLastCell).Row;
MyApp.Visible = false;

It takes about 6-7 seconds for this to take place, is this normal with interop Excel?

Also is there a quicker way to Read an Excel than this?

string[] xx = new string[lastRow];
for (int index = 1; index <= lastRow; index++)
{
   int maxCol = endCol - startCol;
   for (int j = 1; j <= maxCol; j++)
   {
      try
      {
         xx[index - 1] += (MySheet.Cells[index, j] as Excel.Range).Value2.ToString();
      }
      catch
      {    
      }

      if (j != maxCol) xx[index - 1] += "|";
   }
}
MyApp.Quit();
System.Runtime.InteropServices.Marshal.ReleaseComObject(MySheet);
System.Runtime.InteropServices.Marshal.ReleaseComObject(MyBook);
System.Runtime.InteropServices.Marshal.ReleaseComObject(MyApp);
Was it helpful?

Solution 3

This answer is only about the second part of your question. Your are using lots of ranges there which is not as intended and indeed very slow.

First read the complete range and then iterate over the result like so:

var xx[,] = (MySheet.Cells["A1", "XX100"] as Excel.Range).Value2;
for (int i=0;i<xx.getLength(0);i++)
{
    for (int j=0;j<xx.getLength(1);j++)
    {
         Console.WriteLine(xx[i,j].toString());
    }
}

This will be much faster!

OTHER TIPS

Appending to the answer of @RvdK - yes COM interop is slow.

Why is it slow?

It is due to the fact how it works. Every call made from .NET must be marshaled to local COM proxy from there it must be marshaled from one process (your app) to the COM server (Excel) (through IPC inside Windows kernel) then it gets translated (dispatched) from the server's local proxy into a native code where arguments get marshaled from OLE Automation compatible types into native types, their validity checked and the function is performed. Result of the function travels back approximately same way through several layers between 2 different processes.

So each and every command is quite expensive to execute, the more of them you do the slower the whole process is. You can find lots of documentation all around the web as COM is old and well working standard (somehow dying with Visual Basic 6).

One example of such article is here: http://www.codeproject.com/Articles/990/Understanding-Classic-COM-Interoperability-With-NE

Is there a quicker way to read?

  1. ClosedXML can both read and write Excel xlsx files (even formulas, formatting and stuff) using Microsoft's OpenXml SDK, see here: https://closedxml.codeplex.com/wikipage?title=Finding%20and%20extracting%20the%20data&referringTitle=Documentation

  2. Excel data reader claims to be able to read both legacy and new Excel data files, I did not try it myself, take a look here: https://exceldatareader.codeplex.com/

  3. another way to read data faster is to use Excel automation to translate sheet into a data file that you can understand easily and batch process without the interop layer (e.g. XML,CSV). This answer shows how to do it

Short answer: correct, interop is slow. (had the same problem, taking couple of seconds to read 300 lines...

Use a library for this:

You can use this free library, xls & xlsx supported,

Workbook wb = new Workbook();
wb.LoadFromFile(ofd.FileName);

https://freenetexcel.codeplex.com/

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top