Question

I'm using the northwind database to refresh my SQL skills by creating some more or less complex queries. Unfortunately I could not find a solution for my last use case: "Get the sum of the five greatest orders for every category in year 1997."

The tables involved are:

Orders(OrderId, OrderDate)
Order Details(OrderId, ProductId, Quantity, UnitPrice)
Products(ProductId, CategoryId)
Categories(CategoryId, CategoryName)

I have tried the following query

SELECT c.CategoryName, SUM(
  (SELECT TOP 5 od2.UnitPrice*od2.Quantity 
   FROM [Order Details] od2, Products p2
   WHERE od2.ProductID = p2.ProductID
   AND c.CategoryID = p2.CategoryID
   ORDER BY 1 DESC))
FROM [Order Details] od, Products p, Categories c, Orders o 
WHERE od.ProductID = p. ProductID
AND p.CategoryID = c.CategoryID
AND od.OrderID = o.OrderID
AND YEAR(o.OrderDate) = 1997
GROUP BY c.CategoryName

Well... It turned out that subqueries are not allowed in aggregate functions. I've read other posts about this issue but could not find a solution for my specific use case. Hope you can help me out...

Was it helpful?

Solution

Subqueries are not generally allowed in aggregate functions. Instead, move the aggregate inside the subquery. In this case, you'll need an extra level of subquery because of the top 5:

SELECT c.CategoryName,
  (select sum(val)
   from (SELECT TOP 5 od2.UnitPrice*od2.Quantity as val
         FROM [Order Details] od2, Products p2
         WHERE od2.ProductID = p2.ProductID
         AND c.CategoryID = p2.CategoryID
         ORDER BY 1 DESC
        ) t
  )
FROM [Order Details] od, Products p, Categories c, Orders o 
WHERE od.ProductID = p. ProductID
AND p.CategoryID = c.CategoryID
AND od.OrderID = o.OrderID
AND YEAR(o.OrderDate) = 1997
GROUP BY c.CategoryName, c.CategoryId

OTHER TIPS

Use CTE with ROW_NUMBER ranking function instead of excessive subquery.

 ;WITH cte AS
 (
  SELECT c.CategoryName, od2.UnitPrice, od2.Quantity,
         ROW_NUMBER() OVER(PARTITION BY c.CategoryName ORDER BY od2.UnitPrice * od2.Quantity DESC) AS rn
  FROM [Order Details] od JOIN Products p ON od.ProductID = p.ProductID
                          JOIN Categories c ON p.CategoryID = c.CategoryID
                          JOIN Orders o ON od.OrderID = o.OrderID
  WHERE o.OrderDate >= DATEADD(YEAR, DATEDIFF(YEAR, 0, '19970101'), 0)
    AND o.OrderDate < DATEADD(YEAR, DATEDIFF(YEAR, 0, '19970101')+1, 0)
  )
  SELECT CategoryName, SUM(UnitPrice * Quantity) AS val
  FROM cte
  WHERE rn < 6
  GROUP BY CategoryName

Its definitely a sub query problem here is an excellent article on this (originally written for Access but the syntax is identical), also orderdate = 1997 will give order date for 1 jan 1997' -- you need datepart(year, orderdate) = 1997, once you have the (up to five) rows returned for each category you can then encapsulate the rows returned and agregate them

I ran into a very similar problem with an Access subquery where the records were sorted by date. When I used the "Last" aggregate function I found it passed through all of the subqueries and retrieved the last row of data from the Access table, and not the sorted query as intended. Although I could have rewritten the query to use the aggregate function within the first set of parenthesis (as was previously suggested) I found it easier to save query results as a table in the database sorted in the order I wanted and then use the "Last" aggregate function to retrieve the values I wanted. I'll run an update query in the future to keep results current. Not efficient but effective.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top