Вопрос

I have a Django app that tracks publications. Publications have a M2M relationship to authors. Using MySQL.

Simple.

class Publication(models.Model):

    slug = models.SlugField(unique=True, max_length=128)
    author = models.ManyToManyField(Author, blank=True, null=True, through='Authorship')
    title = models.CharField(max_length=128)

    def __unicode__(self):
        return unicode(self.title)

I have a ListView to show them:

class PubList(ListView):
    model = Publication

Most of these are research papers, with several authors. On my template I want to show a list of authors. So I did something like this:

{% for obj in publication_list %}
    <tr>
        <td><a href="{{ obj.get_absolute_url }}">{{ obj.title }}</a></td>
        <td>
            {% for a in obj.authorship_set.all %}
                {{ a.author.last_name }}, {{ a.author.first_name }}
                {% if not forloop.last %}; {% endif %}
            {% endfor %}
        </td>       
    </tr>    
{% endfor %}

Well, you might guess what my issue is. As the number of Publications grows, the DB calls skyrocket. 119 publications is 500+ queries.

I solved it like so: In my PubList(ListView) I override get_context_data and set the output of this function to context['authors']:

def get_authors_by_pub():

    from django.db import connection
    sql = """SELECT p.id,
                (
                    SELECT GROUP_CONCAT(CONCAT(a.last_name, ', ', a.first_name) SEPARATOR '; ')
                    FROM publication_authorship ap
                    LEFT JOIN publication_author a ON a.id = ap.author_id
                    WHERE ap.publication_id = p.id
                )
            FROM publication_publication p"""

    cursor = connection.cursor()
    cursor.execute(sql)
    rows = cursor.fetchall() or ()
    authors = {}
    for r in rows:
        if r[1]:
            authors[r[0]] = r[1]

    return authors

Now I have an authors Dictionary like: {1: 'Tesla, Nikola; Clarke, Aurthur; Hooper, Grace', 2: 'Hopper, Grace; Simpson, Marge'}

Then, on the template, since I can't access a dict by key, I loop through authors to find the ones with the key that is the publication.id:

<td>
    {% for key, value in authors.items %}
        {% if key == obj.id %}
            {{ value }}
        {% endif %}
    {% endfor %}
</td> 

This works, just 2 queries. Even though the authors query is brutal, with nested SELECTs, it's orders of magnitude faster than before.

But I'm wondering if there's a better way. I feel kind of icky looping through the whole dict for each publication on the template. I would love to be able to go authors[obj.id] on the template.

What do you think?

Это было полезно?

Решение

Django covers related queries and lazy loading quite extensively in it's Documentation...Why would you code all of this when django offers:

Publication.objects.prefetch_related('authors').all()

https://docs.djangoproject.com/en/1.6/topics/db/queries/#related-objects https://docs.djangoproject.com/en/1.6/ref/models/querysets/#prefetch-related

You can use the above queryset inside your ListView:

class PublList(ListView):
    queryset = Publication.objects.prefetch_related('authors')
Лицензировано под: CC-BY-SA с атрибуция
Не связан с StackOverflow
scroll top