Question

I'm using latitude and longitude as an index and compiled the R*Tree module into the sqlite3 database to improve performance. I then loaded the tables into an in-memory database. Surprisingly, each call to the ResultSet next method now takes 25-30 ms, compared with 1 ms when the data comes from the hard disk. I normally expect about 250 entries in the result set.

First, a connection to the in-memory database is set up.

Class.forName("org.sqlite.JDBC");
Connection connection = DriverManager.getConnection("jdbc:sqlite::memory:");

Then the tables are copied from the hard disk into the memory:

Statement s = connection.createStatement(); 
s.execute("ATTACH 'myDB.db' AS fs");
s.executeUpdate("CREATE VIRTUAL TABLE coordinates USING rtree(id, min_longitude, max_longitude, min_latitude, max_latitude)");
s.executeUpdate("INSERT INTO coordinates (id, min_longitude, max_longitude, min_latitude, max_latitude) "
    + "SELECT id, min_longitude, max_longitude, min_latitude, max_latitude FROM fs.coordinates");
s.executeUpdate("CREATE TABLE locations AS SELECT * from fs.locations");
s.execute("DETACH DATABASE fs");

The last step is to query the database and copy the result into an object.

final String sql = "SELECT * FROM locations, coordinates WHERE (locations.id = coordinates.id) AND ((min_latitude >= ? AND max_latitude <= ?) AND (min_longitude >= ? AND max_longitude <= ?))";
PreparedStatement ps = connection.prepareStatement(sql);
// calculate the bounding rectangle and bind the four query parameters
ResultSet rs = ps.executeQuery();
while (rs.next()) {
   // copy the stuff here
}

I tried a few things, such as making the connection read-only, using TYPE_FORWARD_ONLY and CONCUR_READ_ONLY, and increasing the fetch size, but rs.next() is just as slow.

Does anybody have an idea what is happening in memory and why rs.next() is so slow? How could I speed up the query?

The system uses the driver sqlite-jdbc 3.7.2 from org.xerial and tomcat6.

Update: Added the database schema

CREATE VIRTUAL TABLE coordinates USING rtree(
 id,
 min_latitude,
 max_latitude,
 min_longitude,
 max_longitude);

CREATE TABLE locations(
 id INTEGER UNIQUE NOT NULL,
 number_of_locations INTEGER DEFAULT 1,
 city_en TEXT DEFAULT NULL,
 zip TEXT DEFAULT NULL,
 street TEXT DEFAULT NULL,
 number TEXT DEFAULT NULL,
 loc_email TEXT DEFAULT NULL,
 loc_phone TEXT DEFAULT NULL,
 loc_fax TEXT DEFAULT NULL,
 loc_url TEXT DEFAULT NULL);

Solution

In the original database, the query first looks up coordinates with the R-tree index, and then looks up matching locations with the index (implied by UNIQUE) on the id column, as shown by this EXPLAIN QUERY PLAN output:

0|0|1|SCAN TABLE coordinates VIRTUAL TABLE INDEX 2:DaBbDcBd
0|1|0|SEARCH TABLE locations USING INDEX sqlite_autoindex_locations_1 (id=?)

In the in-memory database, the table schema has changed because:

A table created using CREATE TABLE AS has no PRIMARY KEY and no constraints of any kind.

This particular table now looks as follows:

CREATE TABLE locations(
  id INT,
  number_of_locations INT,
  city_en TEXT,
  zip TEXT,
  street TEXT,
  number TEXT,
  loc_email TEXT,
  loc_phone TEXT,
  loc_fax TEXT,
  loc_url TEXT
);

It is no longer possible to look up locations by ID efficiently, so the database does full table scans:

0|0|0|SCAN TABLE locations
0|1|1|SCAN TABLE coordinates VIRTUAL TABLE INDEX 1:
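
The two plans above can be reproduced from the application itself, which makes it easy to check that a copied database still uses its indexes. The following is only a sketch: the class name QueryPlanCheck and its methods are hypothetical helpers, not part of sqlite-jdbc, and it assumes that the plan description is in the fourth column of the EXPLAIN QUERY PLAN result.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of sqlite-jdbc) for inspecting query plans.
class QueryPlanCheck {

    // Prefixes a query so that SQLite reports its plan instead of running it.
    static String explainSql(String sql) {
        return "EXPLAIN QUERY PLAN " + sql;
    }

    // Returns one plan description per step, e.g. "SCAN TABLE locations".
    static List<String> queryPlan(Connection connection, String sql, Object... params)
            throws SQLException {
        List<String> plan = new ArrayList<>();
        try (PreparedStatement ps = connection.prepareStatement(explainSql(sql))) {
            // Bind every placeholder; the values do not need to be realistic.
            for (int i = 0; i < params.length; i++) {
                ps.setObject(i + 1, params[i]);
            }
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Assumption: the human-readable detail is the fourth column.
                    plan.add(rs.getString(4));
                }
            }
        }
        return plan;
    }
}
```

Calling queryPlan(connection, sql, 0.0, 1.0, 0.0, 1.0) with the query from the question should list a SEARCH step on locations when the index exists, and a SCAN step when it does not.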

When copying databases, you should always use all the original CREATE TABLE/INDEX statements.
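
As a rough sketch of that advice, the copy step from the question could create locations with its original definition and then fill it with INSERT ... SELECT, instead of using CREATE TABLE AS. The class SchemaPreservingCopy and its methods are hypothetical names chosen for illustration; the schema is taken from the question's update, and the file path is assumed not to contain single quotes.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: copies both tables while keeping the original schema.
class SchemaPreservingCopy {

    // Returns the statements in execution order.
    // Assumption: diskDbPath contains no single quotes.
    static List<String> copyStatements(String diskDbPath) {
        return Arrays.asList(
            "ATTACH '" + diskDbPath + "' AS fs",
            "CREATE VIRTUAL TABLE coordinates USING rtree("
                + "id, min_latitude, max_latitude, min_longitude, max_longitude)",
            "INSERT INTO coordinates "
                + "(id, min_latitude, max_latitude, min_longitude, max_longitude) "
                + "SELECT id, min_latitude, max_latitude, min_longitude, max_longitude "
                + "FROM fs.coordinates",
            // Original definition: UNIQUE on id creates the automatic index
            // that the join needs.
            "CREATE TABLE locations("
                + "id INTEGER UNIQUE NOT NULL, "
                + "number_of_locations INTEGER DEFAULT 1, "
                + "city_en TEXT DEFAULT NULL, "
                + "zip TEXT DEFAULT NULL, "
                + "street TEXT DEFAULT NULL, "
                + "number TEXT DEFAULT NULL, "
                + "loc_email TEXT DEFAULT NULL, "
                + "loc_phone TEXT DEFAULT NULL, "
                + "loc_fax TEXT DEFAULT NULL, "
                + "loc_url TEXT DEFAULT NULL)",
            // Column order matches because both tables share one definition.
            "INSERT INTO locations SELECT * FROM fs.locations",
            "DETACH DATABASE fs");
    }

    // Runs the copy on an already opened in-memory connection.
    static void copyInto(Connection connection, String diskDbPath) throws SQLException {
        try (Statement s = connection.createStatement()) {
            for (String sql : copyStatements(diskDbPath)) {
                s.execute(sql);
            }
        }
    }
}
```

With the connection from the question, copyInto(connection, "myDB.db") would replace the original copy step. Another option that should have a similar effect is to keep CREATE TABLE AS and add a unique index on locations(id) afterwards, although that does not restore the NOT NULL constraint or the column defaults.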

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow