Question

I would like to know how I can import my data to table. I know the COPY command and the option HEADER. But the file I have to import has the following format:

Line 1: header1, header2, header3,... 
Line 2: vartype, vartype, vartype,... 
Line 3: data1, data2,...

Like you can see, I need to skip the second line too. For example:

"phonenumber","countrycode","firstname","lastname"
INTEGER,INTEGER,VARCHAR(50),VARCHAR(50)
123456789,44,"James","Bond"
5551234567,1,"Angelina","Jolie"
912345678,34,"Antonio","Banderas"

The first line is the exact name of the table's columns. I have tried to use the INSERT INTO command but I have not got good result.

Was it helpful?

Solution

I am using these two strategies for this type of problem:

1) Import all

  • import all rows into temporary table where columns have varchar type
  • delete rows you do not want
  • insert data into final table, cast varchar to desired types

2) Pre-process

  • delete rows from imported file
  • import

For your case, you can delete 2nd line using sed for example:

sed -i '2d' importfile.txt

This will remove 2nd line from file named importfile.txt. Note that flag -i will overwrite the file immediately, so use it with care.

You can use this to delete range of lines:

sed -i '2,4d' importfile.txt

This will remove lines 2, 3, 4 from file.

OTHER TIPS

If you are working in a Linux shell you could always just stream in the records you want, eg

tail -[number of lines minus header] <file> | psql <db> -c "COPY <table> FROM STDIN CSV;"

or if your header is marked by say "#"

grep -v "^#" <file> | psql <db> -c "COPY <table> FROM STDIN CSV;"

You'll have to pre-process the file I'm afraid. There are far too many strange formats (like this one) around for COPY to understand - it just concentrates on handling the basics. You can trim the second line out with a simple bit of sed or perl.

perl -ne 'print unless ($.==2)' source_file.txt
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top