fetching single child row based on a max value using Django ORM
https://stackoverflow.com/questions/931730
Question
I have a model, "Market" that has a one-to-many relation to another model, "Contract":
class Market(models.Model):
name = ...
...
class Contract(models.Model):
name= ...
market = models.ForeignKey(Market, ...)
current_price = ...
I'd like to fetch Market objects along with the contract with the maximum price of each. This is how I'd do it via raw SQL:
SELECT M.id as market_id, M.name as market_name, C.name as contract_name, C.price
as price from pm_core_market M INNER JOIN
(SELECT market_id, id, name, MAX(current_price) as price
FROM pm_core_contract GROUP BY market_id) AS C
ON M.id = C.market_id
Is there a way to implement this without using SQL? If there is, which one should be preferred in terms of performance?
Solution
Django 1.1 (currently beta) adds aggregation support to the database API. Your query can be done like this:
from django.db.models import Max, F
Contract.objects.annotate(max_price=Max('market__contract__current_price')).filter(current_price=F('max_price')).select_related()
This generates the following SQL query:
SELECT contract.id, contract.name, contract.market_id, contract.current_price, MAX(T3.current_price) AS max_price, market.id, market.name
FROM contract LEFT OUTER JOIN market ON (contract.market_id = market.id) LEFT OUTER JOIN contract T3 ON (market.id = T3.market_id)
GROUP BY contract.id, contract.name, contract.market_id, contract.current_price, market.id, market.name
HAVING contract.current_price = MAX(T3.current_price)
The API uses an extra join instead of a subquery (like your query does). It is difficult to tell which query is faster, especially without knowing the database system. I suggest that you do some benchmarks and decide.
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow