Question

I have the following database:

CREATE TABLE person_bornYear (name, year INT, prob FLOAT);

I also have a vector of PersonYear objects, each holding a String name, an int year, and a double prob.

I insert the vector elements into the database one row at a time:

private Statement _stat;
private Connection _conn;
private PreparedStatement _prep;
for (PersonYear py : vecToInsert) {
    this.set_prep(this.get_conn().prepareStatement("INSERT into person_bornYear values (?, ?, ?);"));
    this.get_prep().setString(1, py.get_person());
    this.get_prep().setString(2, Integer.toString(py.get_year()));
    this.get_prep().setString(3, Double.toString(py.get_accuracy()));
    this.get_prep().executeUpdate();
}

This takes 2-3 minutes (the vector contains 100K elements).

Can someone suggest a faster way to insert the vector elements into the DB?

Thanks in advance.

Solution

You can execute a simple batch query exactly as in the example here: http://www.mkyong.com/jdbc/jdbc-preparedstatement-example-batch-update/

OTHER TIPS

Two quick things that should speed up your code significantly:

  1. Don't recreate your prepared statement for each iteration. It's a fairly expensive operation, and the object you get back is reusable.
  2. You can batch up multiple calls to the prepared statement, to reduce the number of calls made to the database.

This code is untested, modify as needed:

this.set_prep(this.get_conn().prepareStatement("INSERT into person_bornYear values (?, ?, ?);"));
for (PersonYear py : vecToInsert) {
    this.get_prep().setString(1, py.get_person());
    this.get_prep().setString(2, Integer.toString(py.get_year()));
    this.get_prep().setString(3, Double.toString(py.get_accuracy()));
    this.get_prep().addBatch();
}

this.get_prep().executeBatch();

First of all, you are creating the same prepared statement on every iteration. You would probably gain some speed by creating it once before the loop and reusing it.

Second, since you are doing a whole bunch of operations at the same time, you could use bulk (batch) insertion instead; see the question "Efficient way to do batch INSERTS with JDBC".

Something like this:

PreparedStatement stmt = this.get_conn().prepareStatement("...");
for (...) { 
  ... 
  stmt.addBatch();
  stmt.clearParameters();
}
stmt.executeBatch();
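
The outline above could be fleshed out roughly as follows. This is a minimal, untested-against-a-real-database sketch: the class BatchInsertSketch, the method insertAll, the batchSize parameter, and the simplified PersonYear stand-in are hypothetical names. Turning off auto-commit and flushing every batchSize rows are assumptions that are not part of the outline above; whether they help depends on your driver and backend.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchInsertSketch {

    /** Hypothetical stand-in for the PersonYear class from the question. */
    static class PersonYear {
        final String name;
        final int year;
        final double prob;

        PersonYear(String name, int year, double prob) {
            this.name = name;
            this.year = year;
            this.prob = prob;
        }
    }

    /**
     * Inserts all rows with one reused PreparedStatement, sending them to the
     * database in batches of batchSize. Returns the number of executeBatch() calls.
     */
    static int insertAll(Connection conn, List<PersonYear> rows, int batchSize)
            throws SQLException {
        String sql = "INSERT INTO person_bornYear VALUES (?, ?, ?)";
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // assumption: one transaction for the whole load
        int flushes = 0;
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            int pending = 0;
            for (PersonYear py : rows) {
                stmt.setString(1, py.name);
                stmt.setInt(2, py.year);    // typed setters instead of setString
                stmt.setDouble(3, py.prob);
                stmt.addBatch();
                if (++pending == batchSize) {
                    stmt.executeBatch();    // flush a full batch
                    flushes++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                stmt.executeBatch();        // flush the remainder
                flushes++;
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();                // undo the partial load on failure
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
        return flushes;
    }
}
```

With the asker's accessors, a call might look like BatchInsertSketch.insertAll(this.get_conn(), rows, 1000), where rows holds the converted vector elements and 1000 is an arbitrary batch size to tune.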

Third: why is it so important to insert them so fast? If the rest of the software doesn't need the data right away, you could consider doing the insert in a separate thread. That would allow the primary application to continue while the database chews on your vector data.
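
The background-thread idea might look roughly like this. It is only a sketch: BackgroundInsertSketch and its methods are hypothetical names, and the Runnable passed in is assumed to wrap whatever insert code you already have.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BackgroundInsertSketch {

    // A single worker thread, so queued insert jobs run one after another.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues the insert job and returns immediately; the caller keeps running. */
    Future<?> submit(Runnable insertJob) {
        return worker.submit(insertJob);
    }

    /** Lets already-queued jobs finish, then stops the worker thread. */
    void shutdown() {
        worker.shutdown();
    }
}
```

Calling get() on the returned Future blocks until the insert has finished and rethrows any failure, which is useful if a later step does need the data. Note that the job should not share its Connection with other threads while it runs.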

Depending on your database backend, you could also split up the vector and insert the data concurrently in different threads. If your backend has proper MVCC, that could save you a lot of time.
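
Splitting the vector could be sketched as below. ParallelInsertSketch, partition, and insertConcurrently are hypothetical names, and insertChunk is assumed to be a callback you supply that opens its own connection and inserts one chunk. Whether this is faster at all depends on the backend; some databases serialize writers, in which case it will not help.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class ParallelInsertSketch {

    /** Splits rows into at most `parts` consecutive chunks of near-equal size. */
    static <T> List<List<T>> partition(List<T> rows, int parts) {
        List<List<T>> chunks = new ArrayList<>();
        int size = (rows.size() + parts - 1) / parts; // ceiling division
        for (int from = 0; from < rows.size(); from += size) {
            chunks.add(rows.subList(from, Math.min(from + size, rows.size())));
        }
        return chunks;
    }

    /** Runs insertChunk on each chunk in its own pool thread and waits for all of them. */
    static <T> void insertConcurrently(List<T> rows, int threads, Consumer<List<T>> insertChunk)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<T> chunk : partition(rows, threads)) {
                pending.add(pool.submit(() -> insertChunk.accept(chunk)));
            }
            for (Future<?> f : pending) {
                f.get(); // rethrows a failure from any chunk
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Each chunk should use its own Connection, since JDBC connections are generally not meant to be shared between threads doing work at the same time.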

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow