Question

I have a data set with a two-level breakdown. I am trying to summarize over the top level with a window function (a cumulative sum), but window functions do not collapse the output into a single row per group.

The data looks like this:

date | top level a | a1 | field to summarize
date | top level a | a2 | field to summarize
date | top level b | b1 | field to summarize
date | top level b | b2 | field to summarize
date | top level b | b3 | field to summarize

I am doing:

SUM(field_to_summarize) OVER (partition by top_level order by date) AS CumulativeSum

This repeats the same CumulativeSum for a top level on every one of its rows, as in the sample above. How do I compute the windowed aggregate but return just one row per top level? Like so:

date | top level a | Cumulative Sum for a
date | top level b | Cumulative Sum for b


Solution

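Is this what you are looking for? Note the nested sum() inside sum() over (): the inner sum(value) is an ordinary aggregate computed once per (date, top_level) group, and the outer sum(...) over (...) is a window function that accumulates those group totals. Window functions are evaluated after grouping, so each group produces exactly one output row.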

create table myData(
date date
,top_level text
,second_level text
,value int
);

insert into myData values('2012-12-01','A', 'A1', 12);
insert into myData values('2012-12-01','A', 'A2', 19);
insert into myData values('2012-12-01','B', 'B1', 10);
insert into myData values('2012-12-01','B', 'B2', 15);

insert into myData values('2013-01-01','A', 'A1', 10);
insert into myData values('2013-01-01','A', 'A2', 7);
insert into myData values('2013-01-01','B', 'B1', 20);
insert into myData values('2013-01-01','B', 'B2', 35);


select 
date
,top_level
,sum(value) as value_total
,sum(sum(value))over(partition by top_level order by date) as value_running_total
from myData
group by date, top_level
order by top_level, date;

returns:

date; top_level; value_total; value_running_total
"2012-12-01";"A";31;31
"2013-01-01";"A";17;48
"2012-12-01";"B";25;25
"2013-01-01";"B";55;80
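
If the nested sum(sum(value)) is hard to read, the same logic can be written in two explicit steps. This is only a sketch against the myData table defined above; the CTE name per_day is made up for illustration. It should return the same four rows as the query above.

```sql
-- Step 1: collapse to one row per (date, top_level) with a plain GROUP BY.
with per_day as (
    select
        date
        ,top_level
        ,sum(value) as value_total
    from myData
    group by date, top_level
)
-- Step 2: run the window function over the already-grouped rows.
select
    date
    ,top_level
    ,value_total
    ,sum(value_total) over (partition by top_level order by date) as value_running_total
from per_day
order by top_level, date;
```

The nested form does the same thing in a single query level: the aggregate runs first, and the window function then sees one row per group.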
Licensed under: CC-BY-SA with attribution