Best practice for this changes quite quickly, so I'll answer with what I think is most up-to-date as of 2020-01-18.
With GeoDjango
Using geography=True with GeoDjango makes this much easier. It means everything is stored in lng/lat, but distance calculations are done in meters on the surface of the sphere. See the docs.
from django.db import models
from django.contrib.gis.db.models import PointField
class Vacancy(models.Model):
    location = PointField(srid=4326, geography=True, blank=True, null=True)
Django 3.0
If you have Django 3.0, you can sort your whole table using the following query. It uses PostGIS's <-> operator, which means sorting will use the spatial index and the annotated distance will be exact (for Postgres 9.5+). Note that "sorting by distance" implicitly requires a distance from something, so you need a reference point. The first argument to Point is the longitude and the second is the latitude (the opposite of the normal convention).
from django.contrib.gis.db.models.functions import GeometryDistance
from django.contrib.gis.geos import Point
ref_location = Point(140.0, 40.0, srid=4326)
Vacancy.objects.order_by(GeometryDistance("location", ref_location))
If you want to use the distance from the reference point in any way, you'll need to annotate it:
Vacancy.objects.annotate(distance=GeometryDistance("location", ref_location))\
    .order_by("distance")
If you have a lot of results, calculating the exact distance for every entry will still be slow. You should reduce the number of results with one of the following:
Limit the number of results with queryset slicing
The <-> operator won't calculate the exact distance for (most) results it won't return, so slicing or paginating the results is fast. To get the first 100 results:
Vacancy.objects.annotate(distance=GeometryDistance("location", ref_location))\
    .order_by("distance")[:100]
Only get results within a certain distance with dwithin
If there is a maximum distance that you want results for, you should use dwithin. The dwithin Django lookup uses ST_DWithin, which can use the spatial index, so it's very fast. Setting geography=True means this calculation is done in meters, not degrees. The final query for everything within 50km would be:
Vacancy.objects.filter(location__dwithin=(ref_location, 50000))\
    .annotate(distance=GeometryDistance("location", ref_location))\
    .order_by("distance")
This can speed up queries a bit even if you are slicing down to a few results.
The second argument to dwithin also accepts django.contrib.gis.measure.D objects, which it converts into meters, so instead of 50000 meters, you could just use D(km=50).
Filtering on distance
You can filter directly on the annotated distance, but it will duplicate the <-> call and be a fair amount slower than dwithin.
Vacancy.objects.annotate(distance=GeometryDistance("location", ref_location))\
    .filter(distance__lte=50000)\
    .order_by("distance")
Django 2.X
If you don't have Django 3.0, you can still sort your whole table using Distance instead of GeometryDistance, but it uses ST_Distance, which might be slow if it is done on every entry and there are a lot of entries. If that's the case, you can use dwithin to narrow down the results.
Note that slicing will not be fast because Distance needs to calculate the exact distance for everything in order to sort the results.
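As a rough sketch of the Django 2.X approach described above, the query could be wrapped like this. The helper name vacancies_within and its parameters are hypothetical, and it assumes the model has a geography PointField called location, as in the Vacancy model earlier. dwithin narrows the candidates first, so Distance only has to run on what is left.

```python
def vacancies_within(model, ref_location, max_km=50):
    """Return rows of `model` within max_km of ref_location, nearest first.

    Hypothetical helper: assumes `model` has a PointField named `location`
    declared with geography=True, so distances are in meters.
    """
    # Imported lazily so this module can be loaded without GeoDjango set up.
    from django.contrib.gis.db.models.functions import Distance
    from django.contrib.gis.measure import D

    return (
        model.objects
        # ST_DWithin: cheap pre-filter that can use the spatial index
        .filter(location__dwithin=(ref_location, D(km=max_km)))
        # ST_Distance: exact distance, computed only for the remaining rows
        .annotate(distance=Distance("location", ref_location))
        .order_by("distance")
    )
```

Usage would look like vacancies_within(Vacancy, Point(140.0, 40.0, srid=4326), max_km=50). The annotated distance is a measure object, so the value in meters is available as distance.m on each row.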
Without GeoDjango
If you don't have GeoDjango, you'll need a SQL formula for calculating distance. The efficiency and correctness vary from answer to answer (especially around the poles/dateline), but in general it will be fairly slow.
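To give an idea of the kind of formula involved, here is a minimal sketch of the haversine great-circle distance in plain Python. It assumes a perfectly spherical earth with a mean radius of 6371 km, so it will differ slightly from what PostGIS returns for geography columns; the function name and the constant are my own choices. The same arithmetic can be translated into a SQL expression.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean earth radius (spherical model)


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot before sqrt/asin.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))
```

Because it works with the sine of half the longitude difference, this version gives the short way round across the dateline, which is one of the cases where simpler formulas tend to go wrong.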
One way to speed queries up is to index lat and lng and filter on a min/max range for each before annotating the distance. The math is quite complicated because the bounding "box" isn't exactly a box. See here: How to calculate the bounding box for a given lat/lng location?
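For illustration, here is one way the mins/maxes could be computed in plain Python. This is a sketch under a spherical-earth assumption (mean radius 6371 km), not the method from the linked question; the function name and return convention are my own. It widens the longitude range at higher latitudes, falls back to all longitudes when the search circle contains a pole, and signals a dateline crossing by returning lng_min > lng_max.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean earth radius (spherical model)


def bounding_box(lat, lng, radius_m):
    """Return (lat_min, lat_max, lng_min, lng_max) in degrees.

    If lng_min > lng_max, the box crosses the dateline and the longitude
    condition becomes: lng >= lng_min OR lng <= lng_max.
    """
    r = radius_m / EARTH_RADIUS_M  # angular radius in radians
    lat_r, lng_r = math.radians(lat), math.radians(lng)
    lat_min, lat_max = lat_r - r, lat_r + r

    if lat_min <= -math.pi / 2 or lat_max >= math.pi / 2:
        # The circle contains a pole, so every longitude can match.
        lat_min = max(lat_min, -math.pi / 2)
        lat_max = min(lat_max, math.pi / 2)
        lng_min, lng_max = -math.pi, math.pi
    else:
        # The circle spans more longitude the further it is from the equator.
        dlng = math.asin(math.sin(r) / math.cos(lat_r))
        lng_min, lng_max = lng_r - dlng, lng_r + dlng
        # Wrap values that fall outside [-180, 180] degrees.
        if lng_min < -math.pi:
            lng_min += 2 * math.pi
        if lng_max > math.pi:
            lng_max -= 2 * math.pi

    return tuple(math.degrees(v) for v in (lat_min, lat_max, lng_min, lng_max))
```

The box is only a pre-filter: it always contains the whole circle but also some points outside it, so the exact distance still has to be checked for the rows that survive. With indexed lat and lng columns, the non-wrapping case maps onto two range conditions, one per column.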