Question

I'm trying to write an import function for getting data out of an excel file. How I currently do it is as follows:

Private Sub ReadExcel(ByVal childform As PhoneDiag.frmData, ByVal FileName As String)

        Dim xlApp As Excel.Application
        Dim xlWorkBook As Excel.Workbook
        Dim xlWorkSheet As Excel.Worksheet

        xlApp = New Excel.ApplicationClass
        xlWorkBook = xlApp.Workbooks.Open(FileName)
        xlWorkSheet = xlWorkBook.Worksheets(1)
        Dim columnrange = xlWorkSheet.Columns
        Dim therange = xlWorkSheet.UsedRange

        ''Add rows by column
        For rCnt = 2 To therange.Rows.Count

            Dim rowArray(therange.Columns.Count) As String

            For cCnt = 1 To therange.Columns.Count

                Dim Obj = CType(therange.Cells(rCnt, cCnt), Excel.Range)
                Dim celltext As String
                celltext = Obj.Value.ToString
                rowArray((cCnt - 1)) = celltext

            Next

            childform.datagridSample.Rows.Add(rowArray)

        Next

        '' make sure we close the excel.exe service after use
        xlWorkBook.Close()
        xlApp.Quit()
        System.Runtime.InteropServices.Marshal.ReleaseComObject(xlWorkBook)
        System.Runtime.InteropServices.Marshal.ReleaseComObject(xlWorkSheet)
        System.Runtime.InteropServices.Marshal.ReleaseComObject(xlApp)

    End Sub

The problem is, of course, is that it runs horribly. From what I can gather, it more than likely comes down to this line:

Dim Obj = CType(therange.Cells(rCnt, cCnt), Excel.Range)

All I need is the text from the cells, not to create an object for each cell (and then not send them to garbage collection). Is there an easier way of just getting the text?

Ideally, if I can get a method for getting the text values of the cell, I'd like to get the multiple rowArray()'s added to a master array and update the program's values later.

If you see any other performance tips, let me know. It would be much appreciated. =b

EDIT: I also realize that I have two options if I were to create a master array, say mArr, to hold all the data. Would it be better performance-wise to have mArr to be large and the sub arrays small, or mArr to be small and the sub arrays to hold more of the information?

I ask because the files that will be imported will have more rows than columns, so I was wondering if there was any "set" way of doing it.

Was it helpful?

Solution

Reading cell content individually is a huge performance killer. My advice is to read first the entire range into an array of objects, and then retrieve the data from that array. I don't write in VB.NET, but in C# the code looks something like this:

Excel.Range firstCell = excelWorksheet.get_Range("A1", Type.Missing);
Excel.Range lastCell = excelWorksheet.Cells.SpecialCells(Excel.XlCellType.xlCellTypeLastCell, Type.Missing);
object[,] cellValues;
Excel.Range worksheetCells = excelWorksheet.get_Range(firstCell, lastCell);
cellValues = worksheetCells.Value2 as object[,];

This example reads the entire contents of a sheet into cellValues (careful about nulls). The same advice applies to writing to the sheet - do it all at once, with one array.
If you are interested, I have a longer post on this there
Oh and BTW, replace

xlApp = New Excel.ApplicationClass

by

xlApp = New Excel.Application

OTHER TIPS

here's a C# version (but you get the gist) that will get data into a datatable...

con = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source="+SpreadsheetLocation+";Extended Properties=\"Excel 8.0;IMEX=1;HDR=NO\"");
OleDbDataAdapter da = new OleDbDataAdapter("SELECT * FROM ["+Worksheet+"$]", con);
DataTable dt = new DataTable();
da.Fill(dt);

where "SpreadSheetLocation" and "Worksheet" are a file path and work sheet names respectively. You can then convert the datatable rows to arrays if you like.

Update : you also don't need Excel installed in the machine for this solution...

For simple reading, I'd use the Excel Data Reader available on CodePlex.

Machines using the component don't need Excel installed and it's pretty easy to use. You can read a Worksheet into a DataSet.

SpreadshsetGear for .NET will allow you to open a workbook and access the raw values of cells (numbers, text, logical or error) or get the formatted text of the cells. Because SpreadsheetGear runs as part of your application rather than COM Interop as with Excel, it will run much faster (see the comments on this page to see what some of our customers have said about performance).

You can see live samples here and download the free trial here.

Disclaimer: I own SpreadsheetGear LLC

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top