Normally, when I import data from external sources, I am very wary of problems in the data. SQL Server offers several solutions. Many people use SSIS. I avoid SSIS. One of the reasons is getting it to open an Excel file that is already opened by a user. It has a few other shortcomings as well.
In any case, once the data is in a text file, I create a staging table that has all the columns of the original table with the data type varchar(8000)
. This tends to be larger than necessary, but should be sufficient.
Then, I create a table with the proper columns, and population it using something like:
insert into RealTable (CharColumn, IntColumn, FloatColumn, DateTimeColumn)
select CharColumn,
(case when isnumeric(IntColumn) = 1 and IntColumn not like '%.%' then cast(IntColumn as int end),
(case when isnumeric(FloatColumn) = 1 then cast(FloatColumn as float) end),
(case when isdate(DateColumn) = 1 then cast(DzteColumn as date)
from StagingTable st
That is, I do the type checks in SQL code, using a case
statement to avoid errors. The result are NULL values when the types don't match. I can then investigate the values in the database, using the StagingTable, to understand any issues.
Also, in the RealTable, I always have the following columns:
- RealTableId int identity(1,1)
- CreatedBy varchar(255) default system_user,
- CreatedAt datetime default getdate()
These provide tracking information about the data that often comes in useful.