Generally, for this type of situation I'd have a separate staging table that does not have the PK constraint, which I'd populate using COPY (assuming the data is in a format for which COPY makes sense). Then I'd do something like:
insert into table
select a.*
from staging a
where not exists (select 1
                  from table b
                  where a.id = b.id);
That approach isn't too far off from your original design.
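If you want to try the staging pattern without a server, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Postgres. The table names `target` and `staging` and the sample rows are made up for illustration, and the `executemany` call only takes the place of the COPY you'd run in Postgres.

```python
import sqlite3

# SQLite in memory stands in for Postgres; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table target (id integer primary key, val text)")
conn.execute("create table staging (id integer, val text)")  # no PK constraint

# Rows that already exist in the real table.
conn.execute("insert into target values (1, 'old-1'), (2, 'old-2')")

# In Postgres this bulk load would be a COPY into staging.
conn.executemany(
    "insert into staging values (?, ?)",
    [(2, "new-2"), (3, "new-3"), (4, "new-4")],
)

# Anti-join: copy over only the staging rows whose id is not in target yet.
conn.execute(
    """
    insert into target
    select a.*
    from staging a
    where not exists (select 1 from target b where a.id = b.id)
    """
)
conn.commit()

rows = conn.execute("select id, val from target order by id").fetchall()
print(rows)
```

The row with id 2 keeps its old value and only ids 3 and 4 are added. Note that this only guards against ids already in the target table; if the staging data itself can contain the same id twice, you'd have to deduplicate it before the insert.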
I don't fully understand this part of your question, though, and it doesn't seem entirely relevant to the rest of it:
this approach unfortunately still doesn't work - because every single statement in postgreSQL is committed separately.
That's not true at all, not for any RDBMS. Sure, auto-commit might be enabled on your client, but that doesn't mean Postgres has to commit every statement separately: you can disable auto-commit, or wrap the statements in an explicit transaction. This approach would work:
begin;
insert into table (id) select 1 where not exists (select 1 from table where id = 1);
insert into table (id) select 2 where not exists (select 1 from table where id = 2);
insert into table (id) select 3 where not exists (select 1 from table where id = 3);
commit;
As you pointed out, however, if you've got more than a handful of such statements you'll quickly run into performance problems, which is why I'd use the staging table for anything bigger.
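If you'd like to convince yourself that several statements really can share one transaction, here is a small sketch, again with Python's sqlite3 standing in for Postgres (the table `t` is hypothetical). It runs the same three guarded inserts twice: once followed by a rollback, once followed by a commit.

```python
import sqlite3

# isolation_level=None leaves transaction control to the explicit
# begin/commit/rollback statements below (SQLite as a stand-in for Postgres).
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("create table t (id integer primary key)")
conn.execute("insert into t (id) values (1)")  # pre-existing row

guarded_insert = (
    "insert into t (id) "
    "select ? where not exists (select 1 from t where id = ?)"
)

def current_ids():
    # Helper for this sketch: ids currently visible in t.
    return [row[0] for row in conn.execute("select id from t order by id")]

# Three statements, one transaction, then rollback: all of them are undone.
conn.execute("begin")
for new_id in (1, 2, 3):
    conn.execute(guarded_insert, (new_id, new_id))
conn.execute("rollback")
after_rollback = current_ids()

# Same statements, committed together this time.
conn.execute("begin")
for new_id in (1, 2, 3):
    conn.execute(guarded_insert, (new_id, new_id))
conn.execute("commit")
after_commit = current_ids()

print(after_rollback, after_commit)
```

After the rollback only the original row is left, and after the commit ids 2 and 3 are there as well, with the insert for id 1 skipped both times. In psql the equivalent is just the begin; ... commit; pair shown above.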