Question

I am trying to count the number of truncation errors that occur during a bcp import. My current approach is to redirect the output of bcp to a file and then grep for "right truncation" in that file. Here is the code snippet:

bcp Test..Table in datafile.txt -f format_file -m 0 -S server -T > error_file.txt
error_count=`cat error_file.txt | grep -c ".*right truncation.*" `

The problem is that grep takes too long when there are many rows, and it still takes a long time even when there are no right truncation errors. Is there a better way to do this? I am running the bcp utility on Windows under Cygwin and importing into MS SQL Server 2008.


Solution 2

I looked into bcp further and found the -e switch, which writes any errors directly to an error log file. Grepping that log instead of the full output gives a significant performance boost.

Here is the output of the bcp in command:

Starting copy...
1000 rows sent to SQL Server. Total sent: 1000
1000 rows sent to SQL Server. Total sent: 2000
1000 rows sent to SQL Server. Total sent: 3000
1000 rows sent to SQL Server. Total sent: 4000
1000 rows sent to SQL Server. Total sent: 5000
1000 rows sent to SQL Server. Total sent: 6000
1000 rows sent to SQL Server. Total sent: 7000
1000 rows sent to SQL Server. Total sent: 8000
1000 rows sent to SQL Server. Total sent: 9000
1000 rows sent to SQL Server. Total sent: 10000
1000 rows sent to SQL Server. Total sent: 11000
1000 rows sent to SQL Server. Total sent: 12000
SQLState = 22001, NativeError = 0
Error = [Microsoft][ODBC Driver 11 for SQL Server]String data, right truncation

12406 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total     : 1351   Average : (9182.8 rows per sec.)

[INFO BCP IN: data2.txt] Import failed for 1 rows out of 12407 rows, Total 0.0%.

Fri Sep 20 01:24:39 CDT 2013

[INFO BCP IN: data2.txt] Total time elapsed 0 days 0 hour 0 minutes 2 seconds

Here is the error log output:

#@ Row 12407, Column 1: String data, right truncation @#
The Bulk Copy Program (BCP) is a command-line utility

So by modifying the code to grep the error log file instead of the whole output of bcp in, we get a very good performance boost:

bcp Test..Table in datafile.txt -f format_file -m 0 -S server -T -e error_file.txt
error_count=`fgrep -c "right truncation" error_file.txt `
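The counting step can be tried without a server by running it against a sample error log. The following is only a minimal sketch: count_truncations is a hypothetical helper name, and the sample file simply reuses the error log lines shown above. It uses grep -F, which matches fixed strings the same way fgrep does.

```shell
# count_truncations is a hypothetical helper, not part of bcp.
# It counts lines containing "right truncation" in the file given as $1.
count_truncations() {
    # -F: treat the pattern as a fixed string, -c: print the number of matching lines
    grep -Fc -- "right truncation" "$1"
}

# Build a sample error log in the same format as the -e output shown above.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
#@ Row 12407, Column 1: String data, right truncation @#
The Bulk Copy Program (BCP) is a command-line utility
EOF

error_count=$(count_truncations "$sample_log")
echo "$error_count"
rm -f "$sample_log"
```

With a real import, the helper would be pointed at whatever file was passed to -e.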

fedorqui, TrueY and pgl, thanks for the suggestions.

OTHER TIPS

The biggest speedup you are going to get here comes from piping everything together. That is, instead of your command, you would have:

error_count=$(bcp Test..Table in datafile.txt -f format_file -m 0 -S server -T | fgrep -c "right truncation")

(The above includes the already-suggested change to remove ".*" from your regex, and use fgrep.)

This avoids writing the output to disk and then reading it all back in to search for "right truncation".
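To see the piped form in isolation, here is a rough sketch that swaps bcp for a stand-in. fake_bcp is a hypothetical function that just prints a few lines copied from the bcp output quoted earlier on this page, so no server is needed to try it.

```shell
# fake_bcp is a hypothetical stand-in for the real bcp command.
# It prints a shortened copy of the console output shown earlier.
fake_bcp() {
    cat <<'EOF'
Starting copy...
1000 rows sent to SQL Server. Total sent: 1000
SQLState = 22001, NativeError = 0
Error = [Microsoft][ODBC Driver 11 for SQL Server]String data, right truncation
EOF
}

# Pipe the output straight into grep; nothing is written to disk.
piped_count=$(fake_bcp | grep -Fc "right truncation")
echo "$piped_count"
```

One trade-off of this form is that the rest of the bcp output (progress lines, rows copied, timing) is discarded, since only the count from grep is captured.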

Last minor point: I think the command you included in your post is missing a space (datafile.txt-f)?

Licensed under: CC-BY-SA with attribution