Question

I am having a problem importing data from a CSV file into a MySQL table. I am attempting to use LOAD DATA INFILE, but every time I run my code I get:

"Error Code: 1265. Data Truncated for column 'DIP20' at row 237"

The problem stems from the fact that column DIP20 at row 237 is the first null entry in the csv file, but my research suggests that null entries should be read as 0s by MySQL. This is stopping the whole import from running and no data is making its way into my table. I have been trying to find a way to instruct MySQL to accept Nulls but have not been able to find anything.

Other threads I have found on this topic suggest amending the source data to put a '\N' in place of every null, but this really isn't practical for a couple of reasons. First, I have a number of terabytes of data to process, and second, I have to leave this database with other people when I have finished developing it, and none of them will have the time, ability or inclination to edit data when more is received in the future.

If anyone could suggest a way to get this import to run without falling down on the nulls I would be very grateful.

The code I am trying to run is:

LOAD DATA INFILE '\\\\server\\path\\morepath\\file.csv'
INTO TABLE deidata.tbl_HHDataImport
FIELDS TERMINATED BY ',' ESCAPED BY '\\'
LINES TERMINATED BY '\r\n' STARTING BY ''
IGNORE 1 LINES

The table structure is as follows:

CREATE TABLE tbl_HHDataImport
(
CNF_ID  VARCHAR(10)  PRIMARY KEY,
Read_date  DATETIME,
DIP1 FLOAT,
DIP2 FLOAT,
-- ...{48 DIP columns here}...
DIP47 FLOAT,
DIP48 FLOAT
);

(This is intended to be a staging table from which I will transform the data into a proper relational structure. This is the format of the data I am receiving, which I cannot alter.)

I am used to developing databases in MS SQL Server but am currently working for a slightly cash-strapped, non-profit organisation, so I've been asked to work with MySQL. I thought I was getting on OK with it until I hit this problem. I am using MySQL 5.6.13 and MySQL Workbench 6.0.

Thanks in advance

Tom


Solution

I am a big fan of loading data first into staging tables and then doing the type conversions in the database.

That is, create a staging table that has all the same fields, but defined as varchar(255) or nvarchar(255) (depending on contents of the csv file).

This should load correctly, with no type conversion errors.
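
A minimal sketch of that staging step, under a few assumptions: the staging table is called tbl_HHDataImport_staging (the name used in the insert statements in this answer), only DIP1 and DIP2 are shown for brevity (the real table needs one column per field in the file, so all 48 DIP columns), and the file path and load options are copied unchanged from the question. If the nulls in the file are empty fields, they should arrive here as empty strings instead of raising error 1265:

```sql
-- Staging table: same columns as tbl_HHDataImport, but everything is text,
-- so no type conversion happens during the load.
-- Only DIP1 and DIP2 are shown; the real table needs all 48 DIP columns.
CREATE TABLE tbl_HHDataImport_staging
(
    CNF_ID    VARCHAR(255),
    Read_date VARCHAR(255),
    DIP1      VARCHAR(255),
    DIP2      VARCHAR(255)
);

-- Same file and options as in the question, pointed at the staging table.
LOAD DATA INFILE '\\\\server\\path\\morepath\\file.csv'
INTO TABLE tbl_HHDataImport_staging
FIELDS TERMINATED BY ',' ESCAPED BY '\\'
LINES TERMINATED BY '\r\n' STARTING BY ''
IGNORE 1 LINES;
```

The staging table deliberately has no primary key or other constraints, so duplicate or malformed rows can be inspected after the load rather than stopping it.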

Then do something like:

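insert into tbl_HHDataImport(Read_Date, DIP1,  . . . )
    -- cast(... as float) is not accepted by MySQL 5.6, so decimal is used; the (20,6) precision is an assumption
    select now(), cast(DIP1 as decimal(20,6)), . . . 
    from tbl_HHDataImport_staging;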

If there is a conversion problem, you will readily be able to track it down in the staging table. My guess is that the code should look something like this:

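insert into tbl_HHDataImport(Read_Date, DIP1,  . . . )
    select now(),
           -- decimal rather than float: MySQL 5.6 cast() has no float target; (20,6) is an assumed precision
           (case when DIP1 <> 'NULL' then cast(DIP1 as decimal(20,6)) end), . . . 
    from tbl_HHDataImport_staging;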
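
If the missing values turn out to be empty fields rather than the literal text 'NULL' (an assumption, since the question does not show the raw file), two queries along these lines may help. The first lists staging rows whose DIP20 text does not look like a plain decimal number. The second converts empty strings to NULL with NULLIF while copying into the real table. Only CNF_ID and DIP20 are handled here, and the decimal(20,6) precision is an arbitrary choice for illustration:

```sql
-- Find rows whose DIP20 text is not a plain decimal number
-- (empty strings, the text 'NULL', stray spaces, and so on).
SELECT CNF_ID, DIP20
FROM tbl_HHDataImport_staging
WHERE DIP20 IS NULL
   OR DIP20 NOT REGEXP '^-?[0-9]+([.][0-9]+)?$';

-- Copy into the real table, turning empty strings into NULL.
INSERT INTO tbl_HHDataImport (CNF_ID, DIP20)
SELECT CNF_ID,
       CAST(NULLIF(TRIM(DIP20), '') AS DECIMAL(20,6))
FROM tbl_HHDataImport_staging;
```

The pattern is deliberately strict, so values such as '.5' or '1e-3' are also reported by the first query and can be reviewed before deciding how to convert them.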
Licensed under: CC-BY-SA with attribution