Question

I've been tasked with profiling a postgresql database. The first requirement is to see how fast records can be added, with all possible external bottlenecks removed, in order to find our theoretical limit.

At first I created a csv file with sample data and read it in with the COPY function. Now, all records are added via a function update_or_add(). Is it possible to use COPY along with update_or_add() or is there a better solution that I haven't considered?

Was it helpful?

Solution

Rather than "for each row call update_or_add() on the row", your preferred approach should be to generalize "update_or_add" to work on all rows in a set.

COPY the data from the external source into a TEMPORARY or UNLOGGED table. Then write a query that merges the data from the copied table into your main table, like update_or_add() does but for all rows at once. If concurrent modification of the main table is possible this may require locking the main table with LOCK TABLE main_table IN EXCLUSIVE MODE so that other transactions can only SELECT from it while the merge is going on.

Without your schema or knowing what update_or_add does it's hard to say more. At a guess it's an upsert/merge function, in which case you can do its job massively more efficiently by locking the main table and doing a whole-set merge.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top