I hope I've put the query together correctly below. I've simplified from something much, much bigger. But my question should be clear enough from the example.
In the innermost query, nest1
, let's say the results come back with three entries for invoice 123. The invoice was created (1), then two cash deposits were made (2, 3), the second of which (3), paid off the invoice. From that result set, I'm aggregating the sum. I'm also getting the paid_off_date, which only one of the rows will have.
I'm happy to be schooled on better ways to design this query, but what I don't understand is why I have to select the aggregates all the way out. Why, for example, when I select SUM(cash_deposit) AS 'cash_deposit'
in nest2
, do I also have to select SUM(cash_deposit)
in the outermost query? Once it is aggregated in a subquery, why isn't the aggregate passed along? And why do I have to keep grouping by id at each level?
SELECT id, SUM(cash_deposit), MAX(paid_off_date), MAX(job_phase), MAX(cust_id)
FROM
(
SELECT id AS id, SUM(cash_deposit) AS 'cash_deposit', MAX(paid_off_date) AS 'paid_off_date', MAX(job_phase) AS 'job_phase'
FROM
(
SELECT id, cash_deposit, paid_off_date
FROM invoice
GROUP BY id
) nest1
JOIN job j
ON nest1.id = j.id
GROUP BY id
) nest2
JOIN customer c
ON c.invoice = nest2.id
GROUP BY id
Clarification
Thanks for any posts with improved versions of the query. I'm working with an existing query that produces unexpected results when I start trying to turn the nesting into joins. I'm going to keep playing with it, but I'd also really appreciate if someone could answer the question about aggregates and why they have to be repeated when nesting. In C#, I can sum up three numbers in one function and then pass the resulting value back. I don't have to keep summing the results. I need some help understanding how SQL subqueries function differently in that respect.
Possible Answer
One of my coworkers pointed out that in this case, It's because of my GROUP BY
clauses. Since cash_deposit
and job_phase
, for example, aren't in the GROUP BY
clause, they have to be aggregated.