Question

I have a file with about 1,000,000 lines of fixed width data in it.

I can read it, parse it, do all that.

What I don't know is the best way to put it into a SQL Server database programmatically. I need to do it either via T-SQL or Delphi or C# (in other words, a command line solution isn't what I need...)

I know about BULK INSERT, but that appears to work with CSV only.....?

Should I create a CSV file from my fixed width data and BULK INSERT that?

By "fastest" I mean "Least amount of processing time in SQL Server".

My desire is to automate this so that it is easy for a "clerk" to select the input file and push a button to make it happen.

What's the best way to get the huge number of fixed width records into a SQL Server table?


Solution

I assume that by "fastest" you mean run-time:

The fastest way to do this from compiled code is to use the SqlBulkCopy class to insert the data directly into your target table. You will have to write your own code to open and read the source file, split each line into the appropriate columns according to their fixed-width offsets, and then feed that to SqlBulkCopy. (I think that I have an example of this somewhere, if you want to go this route.)
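A minimal C# sketch of that approach, for illustration only: the table name (`dbo.Target`), connection string, file path, column names, and the offsets/widths are all placeholders you would replace with your real layout.

```csharp
// Sketch: parse fixed-width lines into a DataTable, then bulk-load it.
// Assumes every line is at least 50 characters; real code should validate.
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

class FixedWidthLoader
{
    static void Main()
    {
        var table = new DataTable();
        table.Columns.Add("Code", typeof(string));     // chars 0-9
        table.Columns.Add("Name", typeof(string));     // chars 10-39
        table.Columns.Add("Amount", typeof(decimal));  // chars 40-49

        foreach (string line in File.ReadLines(@"C:\data\input.txt"))
        {
            table.Rows.Add(
                line.Substring(0, 10).Trim(),
                line.Substring(10, 30).Trim(),
                decimal.Parse(line.Substring(40, 10)));
        }

        using (var bulk = new SqlBulkCopy("Server=.;Database=MyDb;Integrated Security=true"))
        {
            bulk.DestinationTableName = "dbo.Target";
            bulk.BatchSize = 10000;   // commit in chunks rather than one giant batch
            bulk.WriteToServer(table);
        }
    }
}
```

Note that buffering 1,000,000 rows in a single DataTable costs a lot of memory; for a file that size you may prefer to hand SqlBulkCopy an IDataReader that streams the file instead.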

The fastest way to do this from T-SQL would be to shell out to DOS and then use BCP to load the file directly into your target table. You will need to make a BCP format file that defines the fixed-width columns for this approach.
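Shelling out from T-SQL looks roughly like this. This is a sketch: it assumes `xp_cmdshell` is enabled on the server, and the database, table, and file paths are placeholders.

```sql
-- Sketch: invoke bcp from T-SQL via xp_cmdshell.
-- -f points at the format file describing the fixed-width layout,
-- -T uses a trusted (Windows) connection, -S names the server.
EXEC master..xp_cmdshell
    'bcp MyDb.dbo.Target in "C:\data\input.txt" -f "C:\data\target.fmt" -T -S .';
```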

The fastest way to do this from T-SQL, without using any CLI, is to use BULK INSERT to load the file into a staging table with only one column, defined as DATA VARCHAR(MAX) (make that NVARCHAR(MAX) if the file has Unicode data in it). Then execute a SQL query you write to split the DATA column into its fixed-width fields and insert them into your target table. This should only take a single INSERT statement, though it could be a big one. (I have an example of this somewhere as well.)
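A sketch of that staging-then-split pattern; the file path, table names, offsets, and column types are placeholders, and it assumes the data contains no tab characters (BULK INSERT's default field terminator).

```sql
-- Sketch: load raw lines into a one-column staging table, then split
-- each line into fixed-width fields with SUBSTRING.
CREATE TABLE dbo.Staging (DATA VARCHAR(MAX));

BULK INSERT dbo.Staging
FROM 'C:\data\input.txt'
WITH (ROWTERMINATOR = '\n');

-- SUBSTRING is 1-based: chars 1-10, 11-40, 41-50 in this made-up layout.
INSERT INTO dbo.Target (Code, Name, Amount)
SELECT
    RTRIM(SUBSTRING(DATA,  1, 10)),
    RTRIM(SUBSTRING(DATA, 11, 30)),
    CAST(SUBSTRING(DATA, 41, 10) AS DECIMAL(10, 2))
FROM dbo.Staging;
```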

Your other 'fastest' option would be to use an SSIS package or the SQL Server Import Wizard (they're actually the same thing, under the hood). SSIS has a pretty steep learning curve, so it's only really worth it if you expect to be doing this (or things like this) for other cases in the future as well.

On the other hand, the Wizard is fairly easy to use as a one-off. The Wizard can also make a schedulable job, so if you need to repeat the same thing every night, that's certainly the easiest, as long as it actually works on your case/file/data. If it doesn't then it can be a real headache to get it right, but fixed-width data should not be a problem.

The fastest of all of these options has always been (and likely will always be) BCP.

Other tips

I personally would do this with an SSIS package. It has the flexibility to handle a fixed-width definition.

If this is a one-time load, use the wizard to import the data. If not, create a package yourself and then schedule it to run periodically.

What I do is load an IDataReader that is wired to the import file.

Then I loop over the IDataReader, validate each row, sometimes massage the data in each row, then push that into XML (or into a DataSet, piggybacking off the ds.GetXml() method).

Then every so many rows (every 1,000 let's say), I push them down to a stored procedure that can handle an xml input.
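The receiving stored procedure for that XML-batch pattern might look something like this. This is a sketch under assumed names: the procedure, element names, and column types are placeholders, not the author's actual code (see their linked example below).

```sql
-- Sketch: a procedure that accepts a batch of rows as XML, e.g.
-- <Rows><Row><Code>A1</Code><Name>x</Name><Amount>9.50</Amount></Row>...</Rows>
CREATE PROCEDURE dbo.ImportBatch @rows XML
AS
BEGIN
    INSERT INTO dbo.Target (Code, Name, Amount)
    SELECT
        r.value('(Code)[1]',   'VARCHAR(10)'),
        r.value('(Name)[1]',   'VARCHAR(30)'),
        r.value('(Amount)[1]', 'DECIMAL(10,2)')
    FROM @rows.nodes('/Rows/Row') AS t(r);  -- shred one node per row
END
```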

If a single row fails validation, I log it for later. (If I had 1,000,000 rows and it's OK to miss one, leaving 999,999 rows properly imported, I deal with the errant entry later.)

If my bulk-insert XML fails (with 1,000 rows in it), I log that entire XML. You could then go over the failed set (of 1,000) and import those rows one by one, logging the bad ones that way. That is: do 1,000 at a time until a batch of 1,000 fails, then do that batch one by one.

I have an example written here:

http://granadacoder.wordpress.com/2009/01/27/bulk-insert-example-using-an-idatareader-to-strong-dataset-to-sql-server-xml/

You have a number of choices, but it depends what you mean by fastest. Fastest for one completion, timed from "I'll do it now"? There is a wizard in SQL Server Management Studio. Fastest to do it on a monthly basis with minimum learning curve? There is the DTS wizard in SQL Server Management Studio. Minimum SQL engine cycles for doing it every night? SSIS: http://en.wikipedia.org/wiki/SQL_Server_Integration_Services

BULK INSERT or bcp is the fastest way to do this, because it can be a minimally logged operation (given the right recovery model and table conditions). In my experience you can easily insert 10k rows per second.

In order to bulk insert fixed width data, you need to create a bulk copy format file:

http://msdn.microsoft.com/en-us/library/ms178129.aspx
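A non-XML format file for a fixed-width layout sets each field's prefix length to 0, gives it a fixed data length, and uses an empty terminator (with the last field terminated by the line break). The layout below is a made-up three-column example, not a template for the asker's actual file; the version number on the first line must match your bcp version.

```
12.0
3
1  SQLCHAR  0  10  ""      1  Code    SQL_Latin1_General_CP1_CI_AS
2  SQLCHAR  0  30  ""      2  Name    SQL_Latin1_General_CP1_CI_AS
3  SQLCHAR  0  10  "\r\n"  3  Amount  ""
```

The columns are: host field order, host data type, prefix length, data length, terminator, server column order, server column name, and collation.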

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow