Question

This ought to be basic, but I have a flat CSV file with 6 million rows that I import into Postgres 9.1, and everything seems fine, except that about 3 million records are silently missing when the import is done.

I initially thought there was a bad delimiter or end-of-file character that caused an abrupt stop around record 3.4 million or so, but the ids (native to the data, not auto-generated) suggest that the missing rows are not sequential but scattered throughout the file.

I want to pre-process it in Python or pandas, but my relative unfamiliarity with Postgres when it comes to logging errors from my COPY or \copy commands means I don't know which records are the offending ones.

Sorry this is not a reproducible example. Hopefully someone here can easily point me in the right direction for logging errors or silently rejected tuples (and possibly the reason for them?). I saw the rejected patch, but there is presumably some way to do this with existing tools.
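
For reference, this is the kind of pre-processing check I have in mind, just a minimal sketch (it assumes a comma delimiter, a header row, and that a suspect row is one whose field count differs from the header):

import csv

# Report rows whose field count differs from the header row.
# Assumes comma delimiter and a header row; adjust as needed.
def find_suspect_rows(path, delimiter=","):
    suspects = []
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter=delimiter)
        header = next(reader)
        expected = len(header)
        for lineno, row in enumerate(reader, start=2):
            if len(row) != expected:
                # record line number, field count, and a short preview
                suspects.append((lineno, len(row), row[:3]))
    return suspects

if __name__ == "__main__":
    for lineno, nfields, preview in find_suspect_rows("data.csv"):
        print("line %d: %d fields, starts with %s" % (lineno, nfields, preview))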


Solution

You might want to try setting log_statement, e.g. in postgresql.conf:

log_statement = 'ddl'

Valid values are none, ddl, mod, and all; mod or all will also log data-loading statements such as COPY FROM.
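
As a rough sketch of how that might look from Python with psycopg2 (the table name, file path, and connection string below are placeholders; log_statement can only be changed per session by a superuser, otherwise set it in postgresql.conf and reload):

import psycopg2

# Enable statement logging for this session, then run the COPY,
# so the statement appears in the server log next to any errors.
conn = psycopg2.connect("dbname=mydb user=postgres")
cur = conn.cursor()
cur.execute("SET log_statement = 'all'")  # superuser only; 'mod' would also cover COPY FROM
with open("data.csv") as f:
    cur.copy_expert("COPY mytable FROM STDIN WITH CSV HEADER", f)
conn.commit()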
