I am new to Redshift, and I found this article while looking for a way to get a plain sequence, which Redshift does not support. Here is the solution I ended up with, shown as a complete example using ROW_NUMBER.
I have two schemas, sta and dim. sta holds the staging tables, and dim holds the dimension tables that I want to populate with ids. My source data has the fields trk_key and name and describes, for instance, publishers.
CREATE TABLE sta.publisher (
trk_key VARCHAR(20),
name VARCHAR(255)
);
CREATE TABLE dim.publisher (
id SMALLINT,
trk_key VARCHAR(20),
name VARCHAR(255),
PRIMARY KEY (id)
);
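Each run starts by reloading the staging table. Here is a minimal sketch of that step, assuming the CSV file sits in S3 and has a header row. The bucket, file name, and IAM role are hypothetical placeholders, not values from my setup.

```sql
-- Empty the staging table before each load.
-- Note: in Redshift, TRUNCATE commits the current transaction.
TRUNCATE TABLE sta.publisher;

-- Hypothetical bucket, file, and IAM role: replace with your own.
COPY sta.publisher (trk_key, name)
FROM 's3://my-bucket/publishers.csv'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
FORMAT AS CSV
IGNOREHEADER 1;
```

If the file has no header row, drop the IGNOREHEADER line.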
First I truncate the sta.publisher table and load a CSV file into it. Then I run the following query:
-- This query is idempotent:
-- it will insert a publisher found in sta.publisher table only if
-- it is not already in dim.publisher table.
INSERT INTO dim.publisher (id, trk_key, name)
SELECT
-- Generate id using max id found in dim.publisher.
-- Start with id=1 if dim.publisher is empty.
(
SELECT NVL(MAX(id), 0)
FROM dim.publisher
) + ROW_NUMBER() OVER() AS id,
trk_key,
name
FROM sta.publisher
-- Only insert record if trk_key is not found in dim.publisher table.
WHERE trk_key NOT IN (
SELECT trk_key
FROM dim.publisher
);
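Two edge cases can bite the query above: NOT IN returns no rows at all if dim.publisher ever contains a NULL trk_key, and a trk_key that appears twice in the staging table gets two ids. Below is a possible variant that guards against both, as a sketch rather than something I have run in production. Keeping MAX(name) for duplicates is an arbitrary choice of mine.

```sql
INSERT INTO dim.publisher (id, trk_key, name)
SELECT
    -- Continue numbering from the highest existing id (0 if the table is empty).
    (SELECT NVL(MAX(id), 0) FROM dim.publisher)
        + ROW_NUMBER() OVER (ORDER BY s.trk_key) AS id,
    s.trk_key,
    s.name
FROM (
    -- Collapse duplicate trk_key values in staging to one row each
    -- and skip rows without a key.
    SELECT trk_key, MAX(name) AS name
    FROM sta.publisher
    WHERE trk_key IS NOT NULL
    GROUP BY trk_key
) AS s
-- NOT EXISTS is unaffected by NULLs in dim.publisher.trk_key.
WHERE NOT EXISTS (
    SELECT 1
    FROM dim.publisher AS d
    WHERE d.trk_key = s.trk_key
);
```

The ORDER BY inside OVER makes the id assignment deterministic for a given batch. Keep in mind that Redshift treats PRIMARY KEY as informational and does not enforce it, so two loads running at the same time could still produce duplicate ids. Run the loads one at a time.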