Question

I have a table with Customer/Phone/City/State/Zip/etc. Occasionally I import the data from a .csv file, and sometimes the zip code is formatted like this: xxxxx-xxxx. I only need a general 5-digit zip code.

How can I delete the last 5 characters without having to do it from Excel, cell by cell (which is what I'm doing now)?

Thanks

EDIT: This is what I used after Craig's suggestion, and it worked. However, some of the entries are Canadian postal codes, which are often formatted x1x-x2x. Running this deletes the last character in the field.

How could I remedy this?


Solution

You'll need to do one of these three things:

  • use an ETL tool to filter the data during insert;
  • COPY into a TEMPORARY or UNLOGGED table then do an INSERT INTO real_table SELECT ... that transforms the data with a suitable substring(...) call; or
  • write a simple Perl/Python/whatever script that reads the csv, transforms it as desired, and inserts the results into PostgreSQL. I'd use Python with the csv module and psycopg2's copy_from.

Such an insert into ... select might look like:

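INSERT INTO real_table(col1, col2, zip)
SELECT
  col1,
  col2,
  -- Trim ZIP+4 values (xxxxx-xxxx) to 5 digits; leave other formats,
  -- such as Canadian postal codes, unchanged.
  CASE
    WHEN zip ~ '^[0-9]{5}-[0-9]{4}$' THEN substring(zip from 1 for 5)
    ELSE zip
  END
FROM temp_table;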
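
For the script route (the third option), here is a minimal Python sketch using only the standard library. It assumes the csv has a header row with a column named zip; that column name, the sample rows, and the function names normalize_zip and transform_csv are all made up for illustration. Only values that look exactly like xxxxx-xxxx are trimmed, so Canadian postal codes pass through untouched.

```python
import csv
import io
import re

# ZIP+4 pattern: five digits, a hyphen, four digits (e.g. 12345-6789).
ZIP_PLUS_4 = re.compile(r"([0-9]{5})-[0-9]{4}")


def normalize_zip(value):
    """Return the 5-digit ZIP for ZIP+4 input; leave anything else unchanged."""
    value = value.strip()
    match = ZIP_PLUS_4.fullmatch(value)
    return match.group(1) if match else value


def transform_csv(source, zip_column="zip"):
    """Read CSV text from `source`; return a buffer with normalized ZIPs.

    `zip_column` is an assumed header name, adjust it to match the real file.
    """
    reader = csv.DictReader(source)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[zip_column] = normalize_zip(row[zip_column])
        writer.writerow(row)
    out.seek(0)  # rewind so the buffer can be read from the start
    return out


# Hypothetical sample data standing in for the real .csv file.
sample = io.StringIO(
    "customer,city,zip\n"
    "Alice,Springfield,12345-6789\n"
    "Bob,Toronto,M5V-3L9\n"
    "Carol,Austin,73301\n"
)
cleaned = transform_csv(sample)
print(cleaned.getvalue())
```

The returned buffer is an ordinary file-like object, so it could then be handed to a psycopg2 cursor for loading, for example via copy_expert with a COPY ... FROM STDIN WITH (FORMAT csv, HEADER) statement. The exact COPY options depend on your table and PostgreSQL version, so treat that part as something to check rather than a drop-in recipe.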
Licensed under: CC-BY-SA with attribution