How would MonetDB perform with multiple sql copy operations running at the same time?

StackOverflow https://stackoverflow.com/questions/9185538

  •  26-04-2021

Question

I need to import a large CSV file into MonetDB and I'm wondering if it would be possible to split the file in two and run two scripts like:

mclient -u monetdb -d mydb < import1.sql
mclient -u monetdb -d mydb < import2.sql

where

  • import1.sql issues a SQL copy instruction using file1.csv, and
  • import2.sql issues a SQL copy instruction using file2.csv

Would this be faster? Would it perform well?

Thanks


Solution

MonetDB uses optimistic concurrency control for concurrent transactions (i.e., for any modifications to the data). That means many threads can operate on the same data at once. Write conflicts, however, are not anticipated and avoided up front through, e.g., locking; they are only detected just before the transaction commits (i.e., after all the actual work has been done).

The scenario you describe is essentially the worst case for this strategy: two concurrent transactions that modify exactly the same data. Both will run for some time, then one will be committed and the other will be rolled back and restarted.

The bottom line is: don't do it :-). What you can do instead is append the LOCKED suffix to the COPY INTO statements, which can yield a significant speedup when loading in single-user mode (see the MonetDB documentation).
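
As a rough sketch of that advice, both halves can be loaded one after the other from a single script instead of from two concurrent sessions. The table name mytable, the file paths, and the comma delimiter are placeholders, not taken from the question. Whether LOCKED is still accepted depends on the MonetDB release (newer versions may ignore or reject it), so check the documentation for your version first.

```sql
-- import.sql (hypothetical): load both files sequentially in one session.
-- mytable and the paths below are assumed names; COPY INTO reads files
-- on the server side, so the paths should be absolute.
COPY INTO mytable FROM '/absolute/path/to/file1.csv' USING DELIMITERS ',' LOCKED;
COPY INTO mytable FROM '/absolute/path/to/file2.csv' USING DELIMITERS ',' LOCKED;
```

This would be run the same way as the scripts in the question, e.g. mclient -u monetdb -d mydb < import.sql, with no other clients connected while the load is in progress.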

Licensed under: CC-BY-SA with attribution