Question

How can I copy a CSV file from an S3 bucket to Redshift and avoid duplicate rows? I have read about the COPY command but didn't find any documentation on using it through the PHP SDK.


Solution

The PHP SDK is designed for administrative tasks (the same ones available from the Web Console).

To load data, simply connect to the database with a PostgreSQL connector and run a COPY query.
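For example, a minimal PHP sketch might look like the following. The host, database, user, password, bucket path and AWS credentials are all placeholders; Redshift speaks the PostgreSQL wire protocol, so PDO's pgsql driver (or pg_connect) works:

<?php
// Minimal sketch: connect to Redshift with PDO's pgsql driver and run COPY.
// Host, database, user, password, bucket and credentials are placeholders.
$pdo = new PDO(
    'pgsql:host=my-cluster.example.redshift.amazonaws.com;port=5439;dbname=mydb',
    'dbuser',
    'dbpassword'
);
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// COPY pulls the CSV straight from S3 into the target table.
$pdo->exec("
    COPY dest_table
    FROM 's3://my-bucket/data.csv'
    CREDENTIALS 'aws_access_key_id=<key>;aws_secret_access_key=<secret>'
    CSV
");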

Avoiding duplicate rows is another issue: Redshift currently does not enforce UNIQUE constraints, so every row present in the source file will be added to the target table (even if a row with the same UNIQUE value already exists).

The documentation gives some hints on how to import only new rows, for example (a PHP sketch of the full flow follows the query below):

  1. COPY data into temp_table;

  2. insert only NEW data:

INSERT INTO dest_table
SELECT * FROM temp_table
WHERE key NOT IN (
  SELECT key FROM dest_table
);
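Putting both steps together from PHP, a sketch of the whole flow might look like this. It uses the same placeholder connection details as above and assumes dest_table has a column named key; the staging table is created with the same columns as the target:

<?php
// Sketch of the staging-table pattern: COPY into a temp table, then insert
// only rows whose key is not already present in the destination.
// Connection details and the S3 path/credentials are placeholders.
$pdo = new PDO(
    'pgsql:host=my-cluster.example.redshift.amazonaws.com;port=5439;dbname=mydb',
    'dbuser',
    'dbpassword'
);
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->beginTransaction();

// 1. Stage the incoming file in a temp table shaped like the target table.
$pdo->exec("CREATE TEMP TABLE temp_table (LIKE dest_table)");
$pdo->exec("
    COPY temp_table
    FROM 's3://my-bucket/data.csv'
    CREDENTIALS 'aws_access_key_id=<key>;aws_secret_access_key=<secret>'
    CSV
");

// 2. Insert only rows whose key does not already exist in the destination.
$pdo->exec("
    INSERT INTO dest_table
    SELECT * FROM temp_table
    WHERE key NOT IN (SELECT key FROM dest_table)
");

$pdo->commit();

Running both statements in one transaction keeps readers from seeing a half-loaded table, and the temporary table is dropped automatically when the session ends.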