Question

I am trying to find the distance from the location of each crime to the nearest bike rack. I have database tables with 5 million crimes and 5,000 bike racks. What I need to do now is find the nearest bike rack to each of the 5 million crimes. I tried writing a Python script using geopy, but it was far too slow (it would have taken weeks at least). After searching online extensively, I found that SQL Server 2012 has a geography datatype, so I created a column of that type from the latitude and longitude of each crime and each bike rack.

Now I am trying to do something like this:

update Crimes set distanceToNearestBikeRack = crimeGeo.STDistance((select geo from bike_racks))

Unfortunately, this would cause the subquery to return many rows and that throws:

Msg 512, Level 16, State 1, Line 1 Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression. The statement has been terminated.

How can I find the nearest point from a multitude of points?


Solution

Normally you would use MIN to find a minimum of a set of values, but it appears it doesn't play well with spatial methods. This works, however:

UPDATE
   Crimes
SET
    distanceToNearestBikeRack = 
        (SELECT TOP 1 crimeGeo.STDistance(BR.geo)
        FROM bike_racks BR
        ORDER BY crimeGeo.STDistance(BR.geo) ASC)

Of course, with five million crimes and five thousand bike racks, this could take... a while. So you'll probably want to look into batching up the updates.

OTHER TIPS

You obviously need to do this calculation once for each crime location, and the subquery will need information about the crime location in order to find the nearest bike rack.

I have not looked into SQL Server 2012, but in the old days one would have used the Haversine formula, which is actually pretty fast. See the answer to "Geolocation and Haversine formula", or my answer to "How to calculate distance from a GPX file?" from a few years ago for a UDF implementation.
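For reference, here is a minimal Python sketch of the Haversine formula mentioned above, plus a brute-force nearest-rack lookup. The function names, the (lat, lon) tuple layout, and the mean Earth radius of 6,371 km are assumptions made for illustration; they do not come from the linked answers, and this is not the UDF implementation referred to there.

```python
import math

# Assumption: spherical Earth with the commonly used mean radius, in metres.
EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def nearest_rack_distance(crime, racks):
    """Smallest Haversine distance from one crime to any rack (brute force).

    `crime` is a hypothetical (lat, lon) tuple; `racks` is an iterable of them.
    """
    lat, lon = crime
    return min(haversine_m(lat, lon, rlat, rlon) for rlat, rlon in racks)
```

Note that this still compares every crime against every rack, so 5 million crimes and 5,000 racks means 25 billion distance evaluations; the sketch illustrates the formula rather than a fast way to run the whole job. Because it treats the Earth as a sphere, its results may also differ slightly from what STDistance returns on geography values.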

Licensed under: CC-BY-SA with attribution