Question

First off, here's my setup:

  • Python 2.7.6
  • Django 1.6
  • PostgreSQL 9.3.1
  • PostGIS 2.1.1

I have loaded up the Natural Earth countries and states datasets into PostGIS. Here's the Django model I'm using:

from django.contrib.gis.db import models


class Location(models.Model):
    name = models.CharField(max_length=255)
    imported_from = models.CharField(max_length=255)
    admin_level = models.CharField(max_length=255, blank=True)
    geometry = models.MultiPolygonField(blank=True, default=None, null=True)
    objects = models.GeoManager()  # override the default manager with a GeoManager instance
    parent = models.ForeignKey('self', blank=True, default=None, null=True)

    def __unicode__(self):
        return self.name

    @staticmethod
    def get_countries(continent):
        return Location.objects.filter(parent=continent).order_by('name')

    @staticmethod
    def get_continents():
        # Continents are the top of the hierarchy, so they have no parent.
        return Location.objects.filter(parent=None).order_by('name')

    @staticmethod
    def get_states(country):
        return Location.objects.filter(parent=country).order_by('name')

This should be fairly self-explanatory, but an important thing to note is that this allows for a hierarchy of locations (e.g., Texas is in the U.S., which is in North America).

I need to get a set of locations that touch some other location. Here's how I'm doing this in the view:

target = Location.objects.get(name='LOCATION_NAME')
touching_locations = set(Location.objects.filter(geometry__touches=target.geometry).values_list('name', flat=True))

This query works just fine for some locations (like Angola), but it's abysmally slow for some others (like the U.S.). I do have a GiST index created on geometry, but I'm not seeing the speed I expected. When I run the query for the U.S., django-debug-toolbar tells me that the query (https://gist.github.com/gfairchild/7476754) takes a whopping 106260.14 ms to complete, which is obviously unacceptable.

The entire locations table only has 4865 entries, so what's going on? Am I issuing this query right?


Solution

Yes, I'd expect it to be slow since the geometry that you linked to is massive:

[[ MULTIPOLYGON - 346 elements, 36054 pts ]]

A GiST index won't help either, since the CPU is burning away determining whether the point is within this one very detailed multipolygon, rather than determining whether it is within a bounding box (bbox) across thousands of rows of data. Note, here is the geometry and a bbox that overlaps a few continents:

[image: the United States multipolygon and its bounding box]

Since the bbox wraps over the date line to include positive longitudes, it covers Europe. This means that if you are querying a point in Europe, it will intersect the bbox for the United States, and PostGIS may need to check this large geometry to see if it touches the polygon. See R-Tree to get an understanding of how the GiST index works, and why smaller boxes with fewer overlaps query fastest.
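To make the bbox effect concrete, here is a minimal pure-Python sketch (it is not how PostGIS is implemented internally). The coordinates are hypothetical, heavily simplified rectangles standing in for the U.S. mainland and the western Aleutian Islands, and the helper names `bbox`, `bbox_contains`, and `point_in_ring` are made up for illustration. It shows a point in Europe passing the cheap bbox filter and therefore needing the exact test, whose cost grows with the number of vertices.

```python
def bbox(points):
    """Return (min_lon, min_lat, max_lon, max_lat) for (lon, lat) pairs."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return (min(lons), min(lats), max(lons), max(lats))


def bbox_contains(box, point):
    """Cheap test: four comparisons, regardless of geometry size."""
    min_lon, min_lat, max_lon, max_lat = box
    lon, lat = point
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def point_in_ring(point, ring):
    """Exact test (even-odd ray casting): visits every edge of the ring."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical, heavily simplified parts (assumption: rough rectangles only).
mainland = [(-125.0, 25.0), (-67.0, 25.0), (-67.0, 49.0), (-125.0, 49.0)]
western_aleutians = [(172.0, 51.0), (179.9, 51.0), (179.9, 53.0), (172.0, 53.0)]
parts = [mainland, western_aleutians]

paris = (2.35, 48.86)    # a point in Europe
denver = (-105.0, 39.7)  # a point inside the simplified mainland

# One bbox for the whole multipolygon spans almost every longitude.
whole_box = bbox(mainland + western_aleutians)

passes_filter = bbox_contains(whole_box, paris)
really_inside = any(point_in_ring(paris, part) for part in parts)

print(whole_box)
print(passes_filter, really_inside)
```

With the real geometry, the exact step would walk tens of thousands of vertices for every candidate that the bbox filter lets through.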


The best solution is to use smaller geometries, which inherently have fewer elements/points and will normally have smaller bboxes to help the GiST index. The "states" dataset you mentioned is better suited, since states have limited geographic extents and probably fewer vertices (which helps detailed spatial relation queries). Besides Natural Earth, a really good dataset for determining administrative boundaries worldwide is: http://www.gadm.org
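As a rough illustration of why smaller geometries help the index, the sketch below compares the area of one bbox around a whole geometry with the combined area of one bbox per part. The rectangles are hypothetical stand-ins (not real boundaries) and the helpers `bbox` and `area` are invented for this example; areas are in square degrees, which is only meaningful as a relative measure.

```python
def bbox(points):
    """Return (min_lon, min_lat, max_lon, max_lat) for (lon, lat) pairs."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return (min(lons), min(lats), max(lons), max(lats))


def area(box):
    """Area of a bbox in square degrees (relative measure only)."""
    min_lon, min_lat, max_lon, max_lat = box
    return (max_lon - min_lon) * (max_lat - min_lat)


# Hypothetical, heavily simplified parts of one large multipolygon.
parts = {
    "mainland": [(-125.0, 25.0), (-67.0, 25.0), (-67.0, 49.0), (-125.0, 49.0)],
    "western_aleutians": [(172.0, 51.0), (179.9, 51.0), (179.9, 53.0), (172.0, 53.0)],
}

all_points = [pt for ring in parts.values() for pt in ring]

whole_area = area(bbox(all_points))                         # one box for everything
parts_area = sum(area(bbox(r)) for r in parts.values())     # one box per part

print(whole_area, parts_area)
```

The single large box is mostly empty space, so many unrelated rows overlap it and must go through the exact check; per-part boxes leave far less empty space.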

Both of these options will move the boundaries and so change what a "touches" query returns: "touches" is true only when two geometries share boundary points without their interiors overlapping, so even small differences in the boundaries make a huge difference. Note that there are several other predicates that are more commonly used and mean different things, such as "intersects", "contains", and "within"; see https://en.wikipedia.org/wiki/DE-9IM
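The sensitivity of "touches" can be seen in a one-dimensional analogy, sketched below with closed intervals instead of polygons. This is an illustration only, not the PostGIS implementation: the functions `intersects`, `interiors_intersect`, and `touches` are hypothetical helpers, and each interval is assumed to satisfy `lo < hi`.

```python
def intersects(a, b):
    """True if closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def interiors_intersect(a, b):
    """True if the open interiors of a and b overlap (assumes lo < hi)."""
    return a[0] < b[1] and b[0] < a[1]


def touches(a, b):
    """True if a and b meet only at their boundaries."""
    return intersects(a, b) and not interiors_intersect(a, b)


country = (0.0, 1.0)

exact_neighbour = (1.0, 2.0)      # shares exactly one boundary point
overlapping = (0.999, 2.0)        # boundary digitised slightly too far in
gapped = (1.001, 2.0)             # boundary digitised slightly too far out

results = {
    "exact": (touches(country, exact_neighbour), intersects(country, exact_neighbour)),
    "overlap": (touches(country, overlapping), intersects(country, overlapping)),
    "gap": (touches(country, gapped), intersects(country, gapped)),
}
print(results)
```

Only the exactly matching boundary counts as touching; a tiny overlap still intersects but no longer touches, and a tiny gap does neither. Datasets whose shared borders are not digitised identically can therefore give different "touches" results.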

Licensed under: CC-BY-SA with attribution