Question

I have an inventory table that contains item_id and the quantity remaining of the item (and some other metadata). An administrator updates the inventory by uploading a CSV file that contains the item_id and the quantity remaining. I can see two ways to apply the file:

  1. Run an `UPDATE` for each row in the CSV file. If my CSV file contains 1 million rows, I will end up sending 1 million update statements from my application server to the database server.
  2. Construct 1 million update statements and send them in a single batch (JDBC allows batched statements).

At first glance, approach 2 looks like the better solution. But can 1 million statements really be batched? And what happens if one of the statements fails for some reason?
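
For reference, under either approach each CSV row would translate into an update of roughly this shape (a sketch only; the column names are taken from the description above, and the `?` placeholders stand for the values bound per row):

update inventory
   set quantity = ?   -- quantity remaining, from the CSV row
 where item_id = ?;   -- item_id, from the CSV row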


Solution

The usual way is to import the CSV file into a staging table: either an unlogged table that you create only once, or a temp table that you create immediately before the import.

Something along these lines:

create temp table inventory_import (item_id integer primary key, quantity integer);
copy inventory_import from '/path/to/file.csv' ... ;

update inventory i
  set quantity = im.quantity
from inventory_import im
where i.item_id = im.item_id;
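
For the unlogged-table variant mentioned above, a minimal sketch could look like this (assuming the same two-column layout and a CSV file with a header row; adjust the COPY options and path to match your file):

-- one-time setup: a permanent, unlogged staging table
create unlogged table if not exists inventory_import
  (item_id integer primary key, quantity integer);

-- before each import: empty the staging table, then load the new file
truncate inventory_import;
copy inventory_import from '/path/to/file.csv' (format csv, header true);

-- apply the staged quantities to the live table in a single statement
update inventory i
  set quantity = im.quantity
from inventory_import im
where i.item_id = im.item_id;

Either way, the final step is a single UPDATE, so it is applied atomically: if it fails, no rows are changed, which sidesteps the concern about one statement in a million-statement batch failing partway through.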