Frage

I am developing a Spring app that has to search for persons using GPS coordinates. For each in the DB person I have his latitude and his longitude. The client pass a point and a max distance to the server which has to return all the clients around this point in this distance.

The problem is that the database contains about 300k persons, so getting all the persons and iterating through this list to chck if they are near the point is really slow.

Any idea to speed up this search ?

War es hilfreich?

Lösung

The best way to handle proximity searches is to start with some kind of bounding-rectangle approximation, and then go from there to an actual great-circle distance between people.

As long as your your latitudes aren't too near the poles, a sloppy but workable approximation for the distance between two points is this (in SQLish):

GREATEST(ABS(lat1-lat2),ABS(long1-long2))

If you want to be more precise and you know you only care about people who are within, let us say, 10 km each other you can use a bounding rectangle search like this.

WHERE latitude_from_table
    BETWEEN latpoint  - (10.0 / 111.045)
        AND latpoint  + (10.0 / 111.045)
  AND longitude_from_table
    BETWEEN longpoint - (10.0 / (111.045 * COS(RADIANS(latpoint))))
        AND longpoint + (10.0 / (111.045 * COS(RADIANS(latpoint))))

This works because there are 111.045 km in one degree of latitude. The cosine terms in the longitude bounds account for the fact that lines of latitude are closer together as you come near to the poles. This lets you exploit MySQL indexes on your latitude_from_table and longitude_from_table columns.

Once you have bounding-box proximity, you can apply a great circle distance formula. Here's background on that. http://www.plumislandmedia.net/mysql/haversine-mysql-nearest-loc/

For the kind of application you are considering, 32-bit IEEE-488 floating point is plenty of precision for your coordinates. If the points you are looking at are really close together (less than a km or so) you want to use the Vincenty formula (http://www.plumislandmedia.net/mysql/vicenty-great-circle-distance-formula/) rather than the more common so-called haversine formula (http://www.plumislandmedia.net/mysql/stored-function-haversine-distance-computation/).

If your number of people goes much above 300K, you may want to consider using the MySQL geospatial indexing scheme. It only works with MyISAM tables, but it is very fast at doing bounding-rectangle searches. See here. http://www.plumislandmedia.net/mysql/haversine-mysql-nearest-loc/

Andere Tipps

Can you use a stored procedure which performs the math and returns only the correct records? That would be much faster than pulling all that data over the wire.

Lizenziert unter: CC-BY-SA mit Zuschreibung
Nicht verbunden mit StackOverflow
scroll top