Question

I have a database with about 12,000 entries. Each entry has a latitude, a longitude, and an empty distance field. I need to find the 25 entries nearest to the current GPS position. My ORM is greenDAO.

I have two problems: I don't know the distance between me and the entries yet, and I can't load all entries into RAM, because when I do the heap grows to 70 MB and the app crashes with an OutOfMemoryError (so I need to use lazy loading).

I tried this approach:

  1. Get an iterator for the given table
  2. Load an entry, calculate its distance from my current position, and save the entry to an ArrayList buffer (every 1000 entries I flush the buffer back into the db with updateInTx(...) and then clear it)
  3. Repeat step 2 while iterator.hasNext()
  4. Query the entries with limit(25).orderAsc()
  5. Return the result

This works, but steps 1-3 are very slow (about 25 sec on a Nexus 7). The rest takes about 1.5 sec.

I have to do this every time the user launches the app or requests a data refresh. Any ideas how to solve this in a better way?

Thanks

EDIT: This is the function for calculating distance, so it's hard to do this in SQL :(

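// Degrees-to-radians factor (assumed value; the snippet uses d2r without showing it).
static final double d2r = Math.PI / 180.0;

// Haversine great-circle distance in kilometres between myPos and place.
double getDistance(GPSCoords myPos, Place place) {
    double dlong = (place.getLongitude() - myPos.getLongitude()) * d2r;
    double dlat = (place.getLatitude() - myPos.getLatitude()) * d2r;
    double a = Math.pow(Math.sin(dlat / 2.0), 2) + Math.cos(myPos.getLatitude() * d2r)
            * Math.cos(place.getLatitude() * d2r) * Math.pow(Math.sin(dlong / 2.0), 2);
    double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    double d = 6367 * c; // 6367 km is the approximate Earth radius

    return d;
}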

Solution 2

I don't see why exactly you feel you need to lazy load your entries. The 70MB heap number sounds pretty suspicious with only 12k entries. Are you grabbing the entire row just to calculate distance? Try just grabbing the columns you need:

  • Latitude
  • Longitude
  • Primary Key

Assuming 8 bytes apiece, that's 24 * 12000 = 288,000 bytes, or roughly 280 kilobytes. Give it some overhead room for just being Java, but you're still looking at something very manageable.

Then you can do the calculations in code, and just have it spit out the primary key for each of the closest points. A second query can grab just those 25 (the entire row this time), and you're done!
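
A minimal sketch of that idea, assuming the three columns have already been read into parallel arrays (the class and method names here are made up for illustration and are not part of greenDAO). It keeps only the best k candidates in a bounded max-heap, so no sort of all 12,000 rows is needed, and it reuses the haversine formula from the question:

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical helper: keeps only the k nearest rows while scanning id/lat/lon triples. */
final class NearestPlaces {
    private static final double D2R = Math.PI / 180.0;   // degrees to radians
    private static final double EARTH_RADIUS_KM = 6367;  // same radius as the question's code

    /** Haversine distance in kilometres, same formula as getDistance in the question. */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dlat = (lat2 - lat1) * D2R;
        double dlon = (lon2 - lon1) * D2R;
        double a = Math.pow(Math.sin(dlat / 2.0), 2)
                + Math.cos(lat1 * D2R) * Math.cos(lat2 * D2R) * Math.pow(Math.sin(dlon / 2.0), 2);
        return EARTH_RADIUS_KM * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    /**
     * Returns the ids of the k nearest rows, nearest first.
     * ids, lats and lons are parallel arrays filled from a three-column query.
     */
    static long[] nearestIds(long[] ids, double[] lats, double[] lons,
                             double myLat, double myLon, int k) {
        if (k <= 0) {
            return new long[0];
        }
        // Max-heap on distance: the head is the farthest of the current best k.
        // Each element is {distance, index into the arrays}.
        PriorityQueue<double[]> heap =
                new PriorityQueue<>(k, Comparator.comparingDouble((double[] e) -> -e[0]));
        for (int i = 0; i < ids.length; i++) {
            double d = distanceKm(myLat, myLon, lats[i], lons[i]);
            if (heap.size() < k) {
                heap.add(new double[] {d, i});
            } else if (d < heap.peek()[0]) {
                heap.poll();                   // drop the current farthest
                heap.add(new double[] {d, i});
            }
        }
        // The heap yields farthest first, so fill the result from the back.
        long[] result = new long[heap.size()];
        for (int j = result.length - 1; j >= 0; j--) {
            result[j] = ids[(int) heap.poll()[1]];
        }
        return result;
    }
}
```

Calling `NearestPlaces.nearestIds(ids, lats, lons, myLat, myLon, 25)` would give the 25 primary keys to pass to the second query.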

OTHER TIPS

You should be able to let SQL do the work inside the database:

select *, ((x - ?)*(x - ?) + (y - ?)*(y - ?)) as distsq from entries
order by distsq limit 25

Unfortunately SQLite has no exponentiation operator, so each term has to be written out twice.
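
Because each coordinate appears twice in the query, it also has to be bound twice. Here is a rough sketch of how the statement and its arguments could be put together on the Java side. The table and column names (`entries`, `lat`, `lon`) and the class itself are hypothetical. The `cos²` weight on the longitude term is my own addition and not part of the query above: it is the usual flat-earth correction for degrees of longitude shrinking away from the equator, and it only holds for short distances.

```java
/** Hypothetical helper that builds the flat squared-distance query and its bind arguments. */
final class FlatDistanceQuery {

    /** SQL that orders rows by (approximate) squared distance; names are placeholders. */
    static String sql(int limit) {
        return "select *, ((lat - ?)*(lat - ?) + ? * (lon - ?)*(lon - ?)) as distsq "
                + "from entries order by distsq limit " + limit;
    }

    /** Bind arguments, in the same order as the five placeholders in sql(). */
    static String[] bindArgs(double myLat, double myLon) {
        // Assumption: scale the longitude term by cos^2(latitude) so that
        // degrees of latitude and longitude are weighted comparably.
        double c = Math.cos(Math.toRadians(myLat));
        String lat = Double.toString(myLat);
        String lon = Double.toString(myLon);
        return new String[] {lat, lat, Double.toString(c * c), lon, lon};
    }
}
```

On Android these could be handed to `SQLiteDatabase.rawQuery(sql, args)`, which takes its arguments as strings; SQLite converts them to numbers when they are used in arithmetic.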

If this is still not fast enough, another approach would be to make bounding box queries centered on your location, adjusting the size of the bounding box by binary search until you have 30 or a few more entries. Indexes on each of the x and y dimension will speed these along.

Edit: Since the OP says earth curvature is important, a bounding box technique is probably the best approach available with unextended SQLite. Here is a proposed algorithm:

Let P be the current position
Let Slat = lat0 be the bounding box latitude half-size initialized with a "best guess"
Let Slon = lon0 be the bounding box longitude half-size initialized with a "best guess"
// NB the best guesses should cover an approximately square area on the ground
loop
  Let W = P.lon - Slon, E = P.lon + Slon, N = P.lat + Slat, S = P.lat - Slat
  C = select count(*) from entries
      where W <= lon and lon <= E and S <= lat and lat <= N
  if C indicates the result is too big (e.g. for memory or read time), 
    Slat = 0.5 * Slat
    Slon = 0.5 * Slon
  else
    Let R be the result of the same query for * instead of count(*)
    Let D be the geometric distance from P to the nearest point on bounding box
    Compute r.dist for all r in R (in memory)
    Sort R by dist (in memory)
    Throw away the tail elements of R where r.dist > D 
       // Can't use these because points outside bounding box might be closer!
    If at least 25 remaining R elements, 
      return top 25
    else
      Slat = 2 * Slat
      Slon = 2 * Slon
    end if
  end if
end loop    
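
Here is a rough Java sketch of the growing half of that loop, under several assumptions that are mine rather than part of the algorithm above: an in-memory list stands in for the table, `queryBox` stands in for the SQL range query, the "result too big" shrink branch is left out, and longitude wrap-around at ±180° is ignored. All names are made up for illustration. The loop stops once the box covers the whole globe, so it also terminates when the table holds fewer than k rows.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of the growing bounding-box search; a list stands in for the table. */
final class BoundingBoxSearch {
    static final double D2R = Math.PI / 180.0;
    static final double EARTH_RADIUS_KM = 6367;

    static final class Place {
        final long id;
        final double lat;
        final double lon;
        double dist; // filled in during the search
        Place(long id, double lat, double lon) { this.id = id; this.lat = lat; this.lon = lon; }
    }

    /** Haversine distance in kilometres, as in the question. */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dlat = (lat2 - lat1) * D2R;
        double dlon = (lon2 - lon1) * D2R;
        double a = Math.pow(Math.sin(dlat / 2.0), 2)
                + Math.cos(lat1 * D2R) * Math.cos(lat2 * D2R) * Math.pow(Math.sin(dlon / 2.0), 2);
        return EARTH_RADIUS_KM * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    /** Stand-in for: select * from entries where W <= lon and lon <= E and S <= lat and lat <= N */
    static List<Place> queryBox(List<Place> table, double s, double n, double w, double e) {
        List<Place> r = new ArrayList<>();
        for (Place p : table) {
            if (w <= p.lon && p.lon <= e && s <= p.lat && p.lat <= n) r.add(p);
        }
        return r;
    }

    /** Returns up to k nearest places, nearest first. slat and slon must be positive. */
    static List<Place> nearest(List<Place> table, double lat, double lon, int k,
                               double slat, double slon) {
        while (true) {
            double s = lat - slat, n = lat + slat, w = lon - slon, e = lon + slon;
            List<Place> r = queryBox(table, s, n, w, e);
            boolean wholeGlobe = s <= -90 && n >= 90 && w <= -180 && e >= 180;
            // D: distance from P to the nearest box edge. The north/south edges are
            // slat degrees of latitude away; the east/west edges are meridians.
            double dNorthSouth = EARTH_RADIUS_KM * slat * D2R;
            double dEastWest = EARTH_RADIUS_KM
                    * Math.asin(Math.cos(lat * D2R) * Math.sin(Math.min(slon, 90.0) * D2R));
            double d = Math.min(dNorthSouth, dEastWest);
            List<Place> kept = new ArrayList<>();
            for (Place p : r) {
                p.dist = distanceKm(lat, lon, p.lat, p.lon);
                // Points farther than D might be beaten by points outside the box.
                if (wholeGlobe || p.dist <= d) kept.add(p);
            }
            kept.sort(Comparator.comparingDouble((Place p) -> p.dist));
            if (kept.size() >= k || wholeGlobe) {
                return kept.subList(0, Math.min(k, kept.size()));
            }
            slat *= 2; // not enough safe candidates yet: grow the box
            slon *= 2;
        }
    }
}
```

In a real app `queryBox` would be the indexed SQL query, so each pass only touches the rows inside the current box.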

Note you need indices for lat and lon. I don't know how good the SQLite query optimizer is in this case. A good optimizer will pick either the lat or lon index based on statistics accumulated from past queries, use this to quickly find all points in the bounding box range for that dimension, then do a scan of this result to get the final result. If the optimizer is not that clever, you want to index only the dimension likely to produce the smallest initial result: in the average case this is the one with greatest geometric extent (distance covered).

An R*-tree index will make the bounding box queries much faster, but at least through Jelly Bean, you'd have to provide your own SQLite instance with this extension included. Perhaps later Android versions included it? I don't know.

Additionally, if you get to the point of including a custom SQLite with the app, it would be pretty easy to add the distance (with curvature) function as an extension.

There are a lot of examples of distance calculation using different flavors of SQL out there. Loading every row from your DB and calculating how far it is, then sorting and taking the closest is going to be slow just from the back and forth to the database. Doing the calculation in SQL and only retrieving the ones you need is going to be much more performant.

You can try moving the distance calculation into the SQL database. You can also add some smarter code that runs the distance calculation only until it has found 25 places whose distance from the current location is less than x (you choose x), or even fewer than 25 items (maybe you just need 7 to fill the screen), and then continues the calculation in the background once the user is already in the app. That would be a much better user experience.

Licensed under: CC-BY-SA with attribution