Question

suppose I have this model:

class PhotoAlbum(models.Model):
    title = models.CharField(max_length=128)
    author = models.CharField(max_length=128)

class Photo(models.Model):
    album = models.ForeignKey('PhotoAlbum')
    format = models.IntegerField()

Now if I want to look at a subset of photos in a subset of albums efficiently. I do it something like this:

someAlbums = PhotoAlbum.objects.filter(author="Davey Jones").prefetch_related("photo_set")
for a in someAlbums:
    somePhotos = a.photo_set.all()

This does only two queries, which is what I expect (one to get the albums, and then one like `SELECT * IN photos WHERE photoalbum_id IN ().

Everything is great.

But if I do this:

someAlbums = PhotoAlbum.objects.filter(author="Davey Jones").prefetch_related("photo_set")
for a in someAlbums:
    somePhotos = a.photo_set.filter(format=1)

Then it does a ton of queries with WHERE format = 1! Am I doing something wrong or is django not smart enough to realise it has already fetched all the photos and can filter them in python? I swear I read somewhere in the documentation that it is supposed to do that...

Was it helpful?

Solution

In Django 1.6 and earlier, it is not possible to avoid the extra queries. The prefetch_related call effectively caches the results of a.photoset.all() for every album in the queryset. However, a.photoset.filter(format=1) is a different queryset, so you will generate an extra query for every album.

This is explained in the prefetch_related docs. The filter(format=1) is equivalent to filter(spicy=True).

Note that you could reduce the number or queries by filtering the photos in python instead:

someAlbums = PhotoAlbum.objects.filter(author="Davey Jones").prefetch_related("photo_set")
for a in someAlbums:
    somePhotos = [p for p in a.photo_set.all() if p.format == 1]

In Django 1.7, there is a Prefetch() object that allows you to control the behaviour of prefetch_related.

from django.db.models import Prefetch

someAlbums = PhotoAlbum.objects.filter(author="Davey Jones").prefetch_related(
    Prefetch(
        "photo_set",
        queryset=Photo.objects.filter(format=1),
        to_attr="some_photos"
    )
)
for a in someAlbums:
    somePhotos = a.some_photos

For more examples of how to use the Prefetch object, see the prefetch_related docs.

OTHER TIPS

From the docs:

...as always with QuerySets, any subsequent chained methods which imply a different database query will ignore previously cached results, and retrieve data using a fresh database query. So, if you write the following:

pizzas = Pizza.objects.prefetch_related('toppings') [list(pizza.toppings.filter(spicy=True)) for pizza in pizzas]

...then the fact that pizza.toppings.all() has been prefetched will not help you - in fact it hurts performance, since you have done a database query that you haven't used. So use this feature with caution!

In your case, "a.photo_set.filter(format=1)" is treated like a fresh query.

In addition, "photo_set" is a reverse lookup - implemented via a different manager altogether.

One can use select_related if you want to use it with filter()

results = Geography.objects.filter(state__pk = 1).select_related('country')
results.query

For more: https://docs.djangoproject.com/en/3.1/ref/models/querysets/#select-related

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top