Question

I have a Django app that tracks publications. Publications have a M2M relationship to authors. Using MySQL.

Simple.

class Publication(models.Model):

    slug = models.SlugField(unique=True, max_length=128)
    author = models.ManyToManyField(Author, blank=True, null=True, through='Authorship')
    title = models.CharField(max_length=128)

    def __unicode__(self):
        return unicode(self.title)

I have a ListView to show them:

class PubList(ListView):
    model = Publication

Most of these are research papers, with several authors. On my template I want to show a list of authors. So I did something like this:

{% for obj in publication_list %}
    <tr>
        <td><a href="{{ obj.get_absolute_url }}">{{ obj.title }}</a></td>
        <td>
            {% for a in obj.authorship_set.all %}
                {{ a.author.last_name }}, {{ a.author.first_name }}
                {% if not forloop.last %}; {% endif %}
            {% endfor %}
        </td>       
    </tr>    
{% endfor %}

Well, you might guess what my issue is. As the number of Publications grows, the DB calls skyrocket. 119 publications is 500+ queries.

I solved it like so: In my PubList(ListView) I override get_context_data and set the output of this function to context['authors']:

def get_authors_by_pub():

    from django.db import connection
    sql = """SELECT p.id,
                (
                    SELECT GROUP_CONCAT(CONCAT(a.last_name, ', ', a.first_name) SEPARATOR '; ')
                    FROM publication_authorship ap
                    LEFT JOIN publication_author a ON a.id = ap.author_id
                    WHERE ap.publication_id = p.id
                )
            FROM publication_publication p"""

    cursor = connection.cursor()
    cursor.execute(sql)
    rows = cursor.fetchall() or ()
    authors = {}
    for r in rows:
        if r[1]:
            authors[r[0]] = r[1]

    return authors

Now I have an authors Dictionary like: {1: 'Tesla, Nikola; Clarke, Aurthur; Hooper, Grace', 2: 'Hopper, Grace; Simpson, Marge'}

Then, on the template, since I can't access a dict by key, I loop through authors to find the ones with the key that is the publication.id:

<td>
    {% for key, value in authors.items %}
        {% if key == obj.id %}
            {{ value }}
        {% endif %}
    {% endfor %}
</td> 

This works, just 2 queries. Even though the authors query is brutal, with nested SELECTs, it's orders of magnitude faster than before.

But I'm wondering if there's a better way. I feel kind of icky looping through the whole dict for each publication on the template. I would love to be able to go authors[obj.id] on the template.

What do you think?

Était-ce utile?

La solution

Django covers related queries and lazy loading quite extensively in it's Documentation...Why would you code all of this when django offers:

Publication.objects.prefetch_related('authors').all()

https://docs.djangoproject.com/en/1.6/topics/db/queries/#related-objects https://docs.djangoproject.com/en/1.6/ref/models/querysets/#prefetch-related

You can use the above queryset inside your ListView:

class PublList(ListView):
    queryset = Publication.objects.prefetch_related('authors')
Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top