Question

I have to read an Excel file with OleDb in a web application and save the data in a database.

Accessing the file and reading it with a DataAdapter or an OleDbDataReader works. I had to specify IMEX=1 and TypeGuessRows=0 because the file contains headers that I need to parse, but they are not in the first row. So I need to read the rows until I hit a known header and then parse all the data after it.

The first column contains UPC numbers with values like 5053366261702. Even though the fields are read as text, the OleDbDataReader returns the numbers in scientific notation, like this: 5.05337E+12

If I don't read the rows as text, the numbers are returned correctly, but the header text disappears.
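
To illustrate why the exponent form cannot simply be converted back afterwards, here is a minimal sketch. The class and method names are made up for this example, and the two values are the ones quoted above. Parsing the string the reader returned gives a rounded number, not the original UPC.

```csharp
using System;
using System.Globalization;

// Illustration only: shows that "5.05337E+12" has already lost digits.
static class ScientificNotationDemo
{
    // Parses a string in exponent form (e.g. "5.05337E+12") into a decimal.
    public static decimal ParseExponent(string text)
    {
        return decimal.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    // True when the exponent string still represents the original UPC exactly.
    public static bool RoundTrips(string exponentText, decimal originalUpc)
    {
        return ParseExponent(exponentText) == originalUpc;
    }
}
```

`ScientificNotationDemo.RoundTrips("5.05337E+12", 5053366261702m)` returns false because the parsed value is 5053370000000. The digits therefore have to be preserved while reading the file, not repaired later.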

I have added the relevant part of the code below. Thanks in advance for any help.

string conn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + fileName + "';Extended Properties='Excel 12.0;HDR=No;IMEX=1;ImportMixedTypes=Text;TypeGuessRows=0'";
using (OleDbConnection objConn = new OleDbConnection(conn))
{
    objConn.Open();
    var exceltables = objConn.GetOleDbSchemaTable(System.Data.OleDb.OleDbSchemaGuid.Tables, new Object[] { null, null, null, "TABLE" });
    var tablename = exceltables.Rows[0]["TABLE_NAME"];
    using (OleDbCommand objCmdSelect = new OleDbCommand("SELECT * FROM [" + tablename + "]", objConn))
    {
        using (OleDbDataReader reader = objCmdSelect.ExecuteReader())
        {
            while (reader.Read())
            {
                string abc = reader[0].ToString(); //do some parsing
            }
        }
    }
}

Solution

I found a solution that works for me: I now open the file twice, with two different connection strings.

Provider=Microsoft.ACE.OLEDB.12.0;Data Source='filename.xlsx';Extended Properties='Excel 12.0;HDR=No;IMEX=1;ImportMixedTypes=Text;TypeGuessRows=0'

I use this first one to find the headers. Once I have found them, I save the row number and open the file again with IMEX=0:

Provider=Microsoft.ACE.OLEDB.12.0;Data Source='filename.xlsx';Extended Properties='Excel 12.0;HDR=No;IMEX=0;ImportMixedTypes=Text;TypeGuessRows=0'
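
Since the two strings differ only in the IMEX value, they could be produced by one small helper. This is only a sketch: `ExcelConnectionStrings.Build` and its `readAsText` parameter are hypothetical names that are not in the original answer, and the provider and extended properties are copied unchanged from the strings above.

```csharp
using System;

// Hypothetical helper (not from the original answer): builds the ACE
// connection string used above, switching only the IMEX value.
static class ExcelConnectionStrings
{
    public static string Build(string fileName, bool readAsText)
    {
        // IMEX=1 is the "read as text" pass used to find the headers,
        // IMEX=0 is the second pass used to read the data.
        int imex = readAsText ? 1 : 0;
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + fileName + "';" +
               "Extended Properties='Excel 12.0;HDR=No;IMEX=" + imex +
               ";ImportMixedTypes=Text;TypeGuessRows=0'";
    }
}
```

`Build("filename.xlsx", true)` yields the first string shown above and `Build("filename.xlsx", false)` the second.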

string connHeaders = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + fileName + "';Extended Properties='Excel 12.0;HDR=No;IMEX=1;ImportMixedTypes=Text;TypeGuessRows=0'";
string connData = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + fileName + "';Extended Properties='Excel 12.0;HDR=No;IMEX=0;ImportMixedTypes=Text;TypeGuessRows=0'";

int dataStartRow = 0;
string tablename = "";

#region Open file to find headers
using (OleDbConnection objConn = new OleDbConnection(connHeaders))
{
    objConn.Open();
    var exceltables = objConn.GetOleDbSchemaTable(System.Data.OleDb.OleDbSchemaGuid.Tables, new Object[] { null, null, null, "TABLE" });
    tablename = exceltables.Rows[0]["TABLE_NAME"].ToString();
    using (OleDbCommand objCmdSelect = new OleDbCommand("SELECT * FROM [" + tablename + "] ", objConn))
    {
        using (OleDbDataReader reader = objCmdSelect.ExecuteReader())
        {
            while (reader.Read())
            {
                if (reader[0].ToString().ToLower() == "upc")
                {
                    for (int i = 0; i < reader.FieldCount; i++)
                    {
                        //find all necessary headers
                    }
                    break;
                }
                dataStartRow++;
            }
        }
    }
}
#endregion

#region Open file again to read data
using (OleDbConnection objConn = new OleDbConnection(connData))
{
    objConn.Open();
    using (OleDbCommand objCmdSelect = new OleDbCommand("SELECT * FROM [" + tablename + "] ", objConn))
    {
        using (OleDbDataReader reader = objCmdSelect.ExecuteReader())
        {
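            // dataStartRow is the zero-based index of the header row, so skip
            // the rows before it and the header row itself (hence <=).
            for (int i = 0; i <= dataStartRow; i++) reader.Read();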
            while (reader.Read())
            {
                //read the line to save it in Database
            }
        }
    }
}
#endregion
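
When the data rows are saved, the UPC cell still has to be turned into a string of plain digits. The following is a minimal sketch under the assumption that, with IMEX=0, the provider returns the numeric column as a `double`. That assumption is not confirmed by the answer, and `UpcCell.ToUpcString` is a hypothetical helper name. The `"0"` format string avoids depending on the default `double` formatting.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: converts a cell value from the reader into a
// digit-only string without scientific notation.
static class UpcCell
{
    public static string ToUpcString(object cell)
    {
        // Empty cells come back as DBNull from a data reader.
        if (cell == null || cell == DBNull.Value) return null;

        // Assumption: numeric cells arrive as double; format them with no
        // decimals and no exponent.
        if (cell is double d) return d.ToString("0", CultureInfo.InvariantCulture);

        // Anything else (for example text cells) is passed through trimmed.
        return Convert.ToString(cell, CultureInfo.InvariantCulture).Trim();
    }
}
```

Inside the second `while (reader.Read())` loop this could be called as `UpcCell.ToUpcString(reader[0])`.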

OTHER TIPS

I have a simple answer that works for me.

File.WriteAllText(file, Regex.Replace( File.ReadAllText(file), "(?<=,)([0-9]{12,})(?=,)", "\"$1\""));

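This works because it puts double quotes around every field of 12 or more digits, which is the length at which the scientific notation starts to appear, so those values are kept as text. It assumes the file is comma-separated text. Note that the lookarounds require a comma on both sides, so a number in the first or last column of a line is not quoted.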
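
A variant of the same idea that also covers the first and last field of each line might look like the sketch below. It assumes a simple comma-separated text file with no commas inside quoted fields. `CsvQuoting.QuoteLongNumbers` is a hypothetical name, and the pattern is an adaptation, not the one from the answer above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper: quotes every all-digit field of 12 or more digits,
// including fields at the start or end of a line.
static class CsvQuoting
{
    // (?<=^|,)   field starts at the beginning of a line or after a comma
    // (?=,|\r?$) field ends at a comma or at the end of a line (LF or CRLF)
    private static readonly Regex LongNumber = new Regex(
        @"(?<=^|,)([0-9]{12,})(?=,|\r?$)", RegexOptions.Multiline);

    public static string QuoteLongNumbers(string csvText)
    {
        return LongNumber.Replace(csvText, "\"$1\"");
    }
}
```

It can be used the same way as the one-liner: `File.WriteAllText(file, CsvQuoting.QuoteLongNumbers(File.ReadAllText(file)));`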

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow