Question

I have such a Book model:

class Book(models.Model):
    authors = models.ManyToManyField(Author, ...)
    ...

In short:

I'd like to retrieve the books whose authors are strictly equal to a given set of authors. I'm not sure if there is a single query that does it, but any suggestions will be helpful.

In long:

Here is what I tried, (that failed to run getting an AttributeError)

# A sample set of authors
target_authors = set((author_1, author_2))

# To reduce the search space, 
# first retrieve those books with just 2 authors.
candidate_books = Book.objects.annotate(c=Count('authors')).filter(c=len(target_authors))

final_books = QuerySet()
for author in target_authors:
    temp_books = candidate_books.filter(authors__in=[author])
    final_books = final_books and temp_books

... and here is what I got:

AttributeError: 'NoneType' object has no attribute '_meta'

In general, how should I query a model with the constraint that its ManyToMany field contains a set of given objects as in my case?

ps: I found some relevant SO questions but couldn't get a clear answer. Any good pointer will be helpful as well. Thanks.

Was it helpful?

Solution

Similar to @goliney's approach, I found a solution. However, I think the efficiency could be improved.

# A sample set of authors
target_authors = set((author_1, author_2))

# To reduce the search space, first retrieve those books with just 2 authors.
candidate_books = Book.objects.annotate(c=Count('authors')).filter(c=len(target_authors))

# In each iteration, we filter out those books which don't contain one of the 
# required authors - the instance on the iteration.
for author in target_authors:
    candidate_books = candidate_books.filter(authors=author)

final_books = candidate_books

OTHER TIPS

You can use complex lookups with Q objects

from django.db.models import Q
...
target_authors = set((author_1, author_2))
q = Q()
for author in target_authors:
    q &= Q(authors=author)
Books.objects.annotate(c=Count('authors')).filter(c=len(target_authors)).filter(q)

Q() & Q() is not equal to .filter().filter(). Their raw SQLs are different where by using Q with &, its SQL just add a condition like WHERE "book"."author" = "author_1" and "book"."author" = "author_2". it should return empty result.

The only solution is just by chaining filter to form a SQL with inner join on same table: ... ON ("author"."id" = "author_book"."author_id") INNER JOIN "author_book" T4 ON ("author"."id" = T4."author_id") WHERE ("author_book"."author_id" = "author_1" AND T4."author_id" = "author_1")

I came across the same problem and came to the same conclusion as iuysal, untill i had to do a medium sized search (with 1000 records with 150 filters my request would time out).

In my particular case the search would result in no records since the chance that a single record will align with ALL 150 filters is very rare, you can get around the performance issues by verifying that there are records in the QuerySet before applying more filters to save time.

# In each iteration, we filter out those books which don't contain one of the 
# required authors - the instance on the iteration.
for author in target_authors:
   if candidate_books.count() > 0:
      candidate_books = candidate_books.filter(authors=author)

For some reason Django applies filters to empty QuerySets. But if optimization is to be applied correctly however, using a prepared QuerySet and correctly applied indexes are necessary.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top