Question

I am using the following command to load data into SQL Server:

INSERT INTO [NewTable]

SELECT * FROM OPENROWSET 
(
'MSDASQL', 'Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=c:\SomeFolder\;'
, 'SELECT * from [SomeFile.csv]'
);

The problem is that the driver apparently tries to guess the data type of each field, and where the cast fails it simply reads in a NULL. For example, let's say I have the following:

SomeCode   SomeName
100        A
299        B
22         C
123        D
ABC        E
900        F

It seems to decide that "SomeCode" is an integer column, and it reads "ABC" as NULL. Is there any way I can stop this from happening? All I want is for the data to be handled as varchars all the way through.

Any ideas?

Solution

Take a look at the second link in my answer on this question about registry keys that control how JET infers types.

You may also want to make sure the ImportMixedTypes key is set to Text.

HKLM\Software\Microsoft\Jet\4.0\Engines\Excel\ImportMixedTypes

You might have to substitute in something else for Excel, however.
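
To make the behaviour described in the question concrete, here is a toy Python model of sample-based type guessing. It is only a sketch and not the driver's actual algorithm: `load_column`, `scan_rows` and `mixed_as_text` are hypothetical names that loosely mirror the idea of sampling a few rows and of treating mixed columns as text.

```python
def looks_like_int(value):
    """Return True when the string can be parsed as an integer."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def load_column(values, scan_rows=4, mixed_as_text=False):
    """Guess a column type from the first scan_rows values, then cast everything.

    Toy model only: values that do not fit the guessed type become None,
    which stands in for the NULLs seen in the question.
    """
    sample = values[:scan_rows]
    int_count = sum(looks_like_int(v) for v in sample)

    if mixed_as_text and 0 < int_count < len(sample):
        column_type = "text"   # the sample itself is mixed, so fall back to text
    elif int_count * 2 > len(sample):
        column_type = "int"    # most sampled values look like integers
    else:
        column_type = "text"

    if column_type == "text":
        return column_type, list(values)
    return column_type, [int(v) if looks_like_int(v) else None for v in values]


codes = ["100", "299", "22", "123", "ABC", "900"]
print(load_column(codes))                                    # "ABC" falls outside the sample
print(load_column(codes, scan_rows=6, mixed_as_text=True))   # sample now includes "ABC"
```

In this model the mixed-types switch only helps when the odd value falls inside the sampled rows, which is why the number of rows scanned matters as much as the mixed-types setting.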

OTHER TIPS

The driver scans only the first few rows to determine the most probable data type for each column, which is a problem in scenarios like yours. However, you can use a format file together with OPENROWSET(BULK ...) or BULK INSERT.

Details on how to write the format file when reading text files: http://msdn.microsoft.com/en-us/library/ms191175.aspx

In your case:

Make a formatfile.xml containing:

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <RECORD>
  <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
  <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
 </RECORD>
 <ROW>
  <COLUMN SOURCE="1" NAME="Col1" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="2" NAME="Col2" xsi:type="SQLNVARCHAR"/>
 </ROW>
</BCPFORMAT>

Change your query to:

BULK INSERT [newTable]
FROM 'C:\somefile.csv' 
WITH (formatfile='C:\formatfile.xml');
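
For files with more than two columns, writing the format file by hand gets tedious. The following Python sketch generates one that maps every field to SQLNVARCHAR. `build_format_file` and its parameters are hypothetical names chosen for illustration, and the sketch assumes a comma-delimited file whose rows end in \r\n.

```python
from xml.sax.saxutils import quoteattr


def build_format_file(column_names, max_length=100, delimiter=",",
                      row_terminator="\\r\\n",
                      collation="SQL_Latin1_General_CP1_CI_AS"):
    """Return the text of an XML format file that reads every column as text."""
    fields, columns = [], []
    last = len(column_names)
    for i, name in enumerate(column_names, start=1):
        # Every field ends at the delimiter, except the last one,
        # which ends at the row terminator.
        terminator = row_terminator if i == last else delimiter
        fields.append(
            f'  <FIELD ID="{i}" xsi:type="CharTerm" '
            f'TERMINATOR={quoteattr(terminator)} MAX_LENGTH="{max_length}" '
            f'COLLATION={quoteattr(collation)}/>'
        )
        columns.append(
            f'  <COLUMN SOURCE="{i}" NAME={quoteattr(name)} xsi:type="SQLNVARCHAR"/>'
        )
    lines = [
        '<?xml version="1.0"?>',
        '<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"',
        ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">',
        ' <RECORD>', *fields, ' </RECORD>',
        ' <ROW>', *columns, ' </ROW>',
        '</BCPFORMAT>',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Column names taken from the sample data in the question.
    print(build_format_file(["SomeCode", "SomeName"]))
```

Save the output as the format file (for example C:\formatfile.xml) and point the formatfile option at it.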

It took me a while to find what I was looking for, so I'm adding it here since this is one of the top search results:

If you are using Microsoft.ACE.OLEDB and are having this issue then you need to add the option "IMEX=1;" (without the quotes) to the data source.

Example:

SELECT * INTO #temp FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0','Excel 12.0 Xml;HDR=YES;IMEX=1;Database=P:\Data\FileName.xlsx' ,'SELECT * FROM [Sheet1$A1:BB100]')

This will read mixed data as text. Hope this helps.

A shortcut to solve this problem is to use "HDR=No". In this case the header row is read as data, and because its values are text, the columns default to text. Afterwards you can simply filter out the header row. In your case:

INSERT INTO [NewTable]

SELECT * FROM OPENROWSET 
(
'MSDASQL', 'Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=c:\SomeFolder\;HDR=No'
, 'SELECT * from [SomeFile.csv]'
)
WHERE [F1] <> 'SomeCode';
Licensed under: CC-BY-SA with attribution