Problem

I have an application that generates a lot of data which needs to be inserted quickly (around 13 million records). I use JPA 2.0/Hibernate with Postgres 9.1, and with multi-threading and batching the inserts every few thousand rows or so I managed to achieve quite good performance (around 25k inserts per second), completing a whole run in around 8 minutes.
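For context, a minimal sketch of the flush/clear batching pattern I mean; the `BatchInserter` class name, the batch size of 1000, and the untyped record list are illustrative assumptions, not details from my actual application:

    import java.util.List;
    import javax.persistence.EntityManager;
    import javax.persistence.EntityManagerFactory;
    import javax.persistence.EntityTransaction;

    public class BatchInserter {
        // Hibernate only groups the INSERTs into JDBC batches if
        // hibernate.jdbc.batch_size is set in persistence.xml, e.g.:
        //   <property name="hibernate.jdbc.batch_size" value="1000"/>
        private static final int BATCH_SIZE = 1000; // illustrative; tune per workload

        public static void insertAll(EntityManagerFactory emf, List<?> records) {
            EntityManager em = emf.createEntityManager();
            EntityTransaction tx = em.getTransaction();
            tx.begin();
            try {
                for (int i = 0; i < records.size(); i++) {
                    em.persist(records.get(i));
                    if (i > 0 && i % BATCH_SIZE == 0) {
                        em.flush(); // push the pending batch of INSERTs to Postgres
                        em.clear(); // detach entities so the persistence context stays small
                    }
                }
                tx.commit(); // the final partial batch is flushed here
            } catch (RuntimeException e) {
                if (tx.isActive()) tx.rollback();
                throw e;
            } finally {
                em.close();
            }
        }
    }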

However, I noticed that a few foreign keys were missing indexes, which I would really like to have, both to drill down into the data for analysis and to delete the data belonging to a specific run. Unfortunately, when I added these 3 indexes to the table that receives most of the inserts, performance dropped drastically to around 3k inserts per second.

Is there any way to avoid this slowdown? I know that one option is to drop the indexes before a run and recreate them at the end (sketched below). Another, clumsier option is to write the data for the biggest table to a file instead and load it with COPY. I guess I could only do that for the largest table in the relation anyway, because of the foreign key values I would need to know (generated through sequences).
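For the drop/recreate option, something like the following JDBC sketch is what I have in mind; the index, table, and column names are placeholders standing in for my three foreign-key indexes:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class IndexToggler {
        // Placeholder names; substitute the real index/table/column names.
        private static final String[] DROPS = {
            "DROP INDEX IF EXISTS idx_big_table_run_id",
            "DROP INDEX IF EXISTS idx_big_table_parent_id",
            "DROP INDEX IF EXISTS idx_big_table_type_id"
        };
        private static final String[] CREATES = {
            "CREATE INDEX idx_big_table_run_id ON big_table (run_id)",
            "CREATE INDEX idx_big_table_parent_id ON big_table (parent_id)",
            "CREATE INDEX idx_big_table_type_id ON big_table (type_id)"
        };

        public static void runWithIndexesDropped(String url, String user, String pass,
                                                 Runnable bulkLoad) throws SQLException {
            try (Connection conn = DriverManager.getConnection(url, user, pass);
                 Statement stmt = conn.createStatement()) {
                for (String sql : DROPS) {
                    stmt.execute(sql); // no index maintenance during the load
                }
                bulkLoad.run(); // the multi-threaded JPA insert phase
                for (String sql : CREATES) {
                    stmt.execute(sql); // one bulk index build instead of ~13M incremental updates
                }
            }
        }
    }

What bothers me about this is that the application then owns schema details that Hibernate otherwise manages.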

Both alternatives feel like hacks. Is there another solution, perhaps one less intrusive on the application? Some setting to tell Postgres to defer index maintenance, or something of that sort?

Any ideas welcome.

No correct solution
