Question

I have a Django model that looks something like this:

class Response(models.Model):
    transcript = models.TextField(null=True)

class Coding(models.Model):
    qid = models.CharField(max_length = 30)
    value = models.CharField(max_length = 200)
    response = models.ForeignKey(Response)
    coder = models.ForeignKey(User)

For each Response object, there are two coding objects with qid = "risk", one for coder 3 and one for coder 4. What I would like to be able to do is get a list of all Response objects for which the difference in value between coder 3 and coder 4 is greater than 1. The value field stores numbers 1-7.

I realize in hindsight that setting up value as a CharField may have been a mistake, but hopefully I can get around that.

I believe something like the following SQL would do what I'm looking for, but I'd rather do this with the ORM

SELECT UNIQUE c1.response_id FROM coding c1, coding c2
WHERE c1.coder_id = 3 AND 
      c2.coder_id = 4 AND
      c1.qid = "risk" AND 
      c2.qid = "risk" AND
      c1.response_id = c2.response_id AND
      c1.value - c2.value > 1
Was it helpful?

Solution

from django.db.models import F
qset = Coding.objects.filter(response__coding__value__gt=F('value') + 1,
                             qid='risk',  coder=4
                    ).extra(where=['T3.qid = %s', 'T3.coder_id = %s'],
                            params=['risk', 3])
responses = [c.response for c in qset.select_related('response')]

When you join to a table already in the query, the ORM will assign the second one an alias, in this case T3, which you can using in parameters to extra(). To find out what the alias is you can drop into the shell and print qset.query.

See Django documentation on F objects and extra

Update: It seems you actually don't have to use extra(), or figure out what alias django uses, because every time you refer to response__coding in your lookups, django will use the alias created initially. Here's one way to look for differences in either direction:

from django.db.models import Q, F
gt = Q(response__coding__value__gt=F('value') + 1)
lt = Q(response__coding__value__lt=F('value') - 1)
match = Q(response__coding__qid='risk', response__coding__coder=4)
qset = Coding.objects.filter(match & (gt | lt), qid='risk', coder=3)
responses = [c.response for c in qset.select_related('response')]

See Django documentation on Q objects

BTW, If you are going to want both Coding instances, you have an N + 1 queries problem here, because django's select_related() won't get reverse FK relationships. But since you have the data in the query already, you could retrieve the required information using the T3 alias as described above and extra(select={'other_value':'T3.value'}). The value data from the corresponding Coding record would be accessible as an attribute on the retrieved Coding instance, i.e. as c.other_value.

Incidentally, your question is general enough, but it looks like you have an entity-attribute-value schema, which in an RDB scenario is generally considered an anti-pattern. You might be better off long-term (and this query would be simpler) with a risk field:

class Coding(models.Model):
    response = models.ForeignKey(Response)
    coder = models.ForeignKey(User)
    risk = models.IntegerField()
    # other fields for other qid 'attribute' names...
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top