Question

What is the difference between these three solutions in terms of batch-update semantics and performance?

1)

INSERT INTO table1 VALUES ('A','A1'), ('B','B1'), ('C','C1'), ...

2)

Transaction.begin();

INSERT INTO table1 VALUES ('A','A1');
INSERT INTO table1 VALUES ('B','B1');
INSERT INTO table1 VALUES ('C','C1');
...

Transaction.commit();

3)

PreparedStatement.addBatch(...);
PreparedStatement.executeBatch();

OR Set Batch=true in MyBatis configuration


OTHER TIPS

Assuming your database accepts syntax #1 as valid (MS SQL Server 2008, for example), and assuming #1 and #2 are both written in the database's native SQL dialect...

Then for performance part of your question...

#1 is the fastest because it will be parsed/prepared once and executed once as a single atomic transaction.
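As a runnable illustration of #1 (using Python's stdlib `sqlite3` as a stand-in for the asker's JDBC/MyBatis stack, purely for demonstration), the whole batch travels as one statement that is parsed once and executed once:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 TEXT, col2 TEXT)")

# Approach #1: one multi-row INSERT, parsed once, executed once.
conn.execute(
    "INSERT INTO table1 VALUES ('A','A1'), ('B','B1'), ('C','C1')"
)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
print(count)  # 3
```

Note that multi-row `VALUES` lists are dialect-dependent; SQLite, MySQL, PostgreSQL, and SQL Server 2008+ support them, but not every DBMS does.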

#2 will be next in performance. There are two extra statements (transaction begin and transaction end), plus one statement per insert, and each insert must be parsed and executed separately.
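A minimal sketch of #2 (again using Python's `sqlite3` as an illustrative stand-in, not the asker's actual stack): individual single-row inserts wrapped in one explicit transaction, so there is only one commit but each statement is still parsed on its own.

```python
import sqlite3

# isolation_level=None puts the driver in autocommit mode so we can
# issue BEGIN/COMMIT ourselves, mirroring Transaction.begin()/commit().
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE table1 (col1 TEXT, col2 TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO table1 VALUES ('A','A1')")  # parsed separately
conn.execute("INSERT INTO table1 VALUES ('B','B1')")  # parsed separately
conn.execute("INSERT INTO table1 VALUES ('C','C1')")  # parsed separately
conn.execute("COMMIT")

count = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
print(count)  # 3
```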

#2a (something I suggest) is faster than plain #2. Depending on the syntax your server supports, you "prepare" a parameterized SQL statement once and then repeatedly execute it, passing each set of values as parameters. This way the statement itself is parsed only once. E.g.

  Transaction.begin();  
  Stmt.Prepare("insert into table1 values(:Var1, :Var2)");  
  Stmt.Execute('A','A1');  
  Stmt.Execute('B','B1');  
  Stmt.Execute('C','C1');  
  Transaction.commit();  
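The pseudocode above can be made concrete like this (a sketch in Python's `sqlite3`, standing in for the asker's JDBC stack; in JDBC the equivalent is a `PreparedStatement` with `?` placeholders executed in a loop):

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manual txn control
conn.execute("CREATE TABLE table1 (col1 TEXT, col2 TEXT)")

rows = [("A", "A1"), ("B", "B1"), ("C", "C1")]

conn.execute("BEGIN")
cur = conn.cursor()
for row in rows:
    # The ? placeholders keep the statement text identical on every call,
    # so the driver can reuse the parsed statement instead of re-parsing.
    cur.execute("INSERT INTO table1 VALUES (?, ?)", row)
conn.execute("COMMIT")

count = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
print(count)  # 3
```

Parameterizing also sidesteps the quoting/escaping (and SQL injection) issues that come with building literal `VALUES` lists by hand.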

#3 is based on an ORM framework, so it will have its own overhead and will generally be slower than #1 and #2a. It could still be faster than #2, depending on how it is implemented. In rare cases #3 can even beat both #1 and #2a, if the framework is intelligent enough to internally convert such repeated inserts into database-specific bulk-load calls.
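For the batch-API flavor of #3, the closest stdlib analog to JDBC's `addBatch()`/`executeBatch()` is `executemany` (again a Python/`sqlite3` sketch, not the MyBatis mechanism itself): one prepared statement, with all parameter sets submitted through a single call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 TEXT, col2 TEXT)")

rows = [("A", "A1"), ("B", "B1"), ("C", "C1")]

# executemany plays the role of addBatch()/executeBatch(): the statement
# is prepared once and executed once per parameter set.
conn.executemany("INSERT INTO table1 VALUES (?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
print(count)  # 3
```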

For batch-update definition part of your question...

That's also basically a two-part decision:
#A: a subjective choice of taste.
#B: whether it can do the job at hand.

I personally like #2a because it is nearly as fast as #1 but more readable, and it can take on a much larger dataset: it can sit in a loop, reading values from a stream or file and inserting them as it goes, whereas #1 is sometimes limited by the maximum SQL statement size of a particular DBMS implementation. #3 and other variations really depend on the ORM framework you are using, so it is hard to be more specific.
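To make the streaming point concrete (still a Python/`sqlite3` sketch, with `io.StringIO` standing in for a hypothetical large input file), a prepared statement fed by a generator never needs the whole dataset in memory or in one giant SQL string:

```python
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 TEXT, col2 TEXT)")

# Stand-in for a large file or network stream (hypothetical CSV-like data).
stream = io.StringIO("A,A1\nB,B1\nC,C1\n")

def row_iter(fh):
    """Yield one parameter tuple per input line, lazily."""
    for line in fh:
        yield tuple(line.rstrip("\n").split(","))

# The generator is consumed row by row, so unlike approach #1 the
# dataset size is never limited by the maximum SQL statement length.
conn.executemany("INSERT INTO table1 VALUES (?, ?)", row_iter(stream))
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
print(count)  # 3
```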

This is a broad question, so I've kept the answer broad as well. If anything specific is unclear, please comment and I'll be happy to expand.

Licensed under: CC-BY-SA with attribution