Parsing an Excel file in C#, the cells seem to get cut off at 255 characters… how do I stop that?

StackOverflow https://stackoverflow.com/questions/926453

Question

I am parsing through an uploaded excel files (xlsx) in asp.net with c#. I am using the following code (simplified):

string connString = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileLocation + ";Extended Properties=\"Excel 12.0 Xml;HDR=YES\";");
OleDbDataAdapter adapter = new OleDbDataAdapter("SELECT * FROM [Sheet1$]", connString);
DataSet ds = new DataSet();
adapter.Fill(ds);
adapter.Dispose();
DataTable dt = ds.Tables[0];
var rows = from p in dt.AsEnumerable() select new { desc = p[2] };

This works perfectly, but if there is anything longer than 255 characters in the cell, it will get cut off. Any idea what I am doing wrong? Thank you.

EDIT: When viewing the excel sheet, it shows much more than 255 characters, so I don't believe the sheet itself is limited.

Was it helpful?

Solution

Just from a quick Googling of the subject, it appears that that's a limit of Excel.

EDIT: Possible workaround (unfortunately in VB)

OTHER TIPS

The Solution!

I've been battling this today as well. I finally got it to work by modifying some registry keys before parsing the Excel spreadsheet.

You must update this registry key before parsing the Excel spreadsheet:

// Excel 2010
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel\
or
HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel\

// Excel 2007
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel\

// Excel 2003
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel\

Change TypeGuessRows to 0 and ImportMixedTypes to Text under this key. You'll also need to update your connection string to include IMEX=1 in the extended properties:

string connString = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileLocation + ";Extended Properties=\"Excel 12.0 Xml;HDR=YES;IMEX=1\";");

References

http://blogs.vertigo.com/personal/aanttila/Blog/archive/2008/03/28/excel-and-csv-reference.aspx

http://msdn.microsoft.com/en-us/library/ms141683.aspx

...characters may be truncated. To import data from a memo column without truncation, you must make sure that the memo column in at least one of the sampled rows contains a value longer than 255 characters, or you must increase the number of rows sampled by the driver to include such a row. You can increase the number of rows sampled by increasing the value of TypeGuessRows under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel registry key....

I have came across this, and the solution that worked for me was to move the cells with long text to the top of the spreadsheet.

I found this comment in a forum describing the issue

This is an issue with the Jet OLEDB provider. It looks at the first 8 rows
of the spreadsheet to determine the data type in each column. If the column does
not contain a field value over 256 characters in the first 8 rows, then it assumes the
data type is text, which has a character limit of 256. The following KB article has
more information on this issue: http://support.microsoft.com/kb/281517

Hope this help someone else!

Have you tried setting the columns datatype to text within the spreadsheet? I believe doing this will allow the cells to contain much more than 255 characters.

[Edit] For what it's worth this dialog with the MS-Excel team is an interesting read. In the comments section at the bottom they get into some discussions about that 255 cutoff. They say Excel 12 can support 32k characters per cell.

If that is true there must be a way to get at this data. Here is two things to consider.

  1. In the past I have used the "IMEX=1" option in my connection string to deal with columns containing mixed data showing up as empty. It's a longshot, but you might give that a try.

  2. Could you export the file to a tab delimited flat file? IMHO this is the most reliable way of dealing with Excel data, since Excel does have so many gotchas.

SpreadsheetGear for .NET can read and write (and more) xls and xlsx workbooks and supports the same limitations as Excel for text - in other words it will just work. There is a free evaluation if you want to give it a try.

Disclaimer: I own SpreadsheetGear LLC

Regarding the last post, I also use SpreadsheetGear and find that it also suffers from the 255 characters per cell limitation when reading from the older XLS (not XLSX) format.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top