Question

I have a table structured like this:

id | date       | count
---+------------+-------
0  | 01-01-2000 | 46
1  | 01-01-2000 | 25
...
0  | 01-02-2000 | 235
1  | 01-02-2000 | 23
...

And so on. I'd like to create a table (solely for presentation purposes; there's clearly no reason to store the data like this) that shows the average count for each id on each day of the week. As an example, I could show the sum for each day of the week with something like this:

id | sun | mon | tues | wed | thur | fri | sat |
---+-----+-----+------+-----+------+-----+-----+
0  | 146 | 13  | 51   | 123 | ...
1  | 225 | 245 | 2367 | 25  | ...
...
0  | 235 | 246 | 25   | ....
1  | 23  | .....
...

CREATE TABLE dailySum AS
SELECT id,
       SUM(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 0 THEN count ELSE 0 END) as sun,
       SUM(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 1 THEN count ELSE 0 END) as mon,
       SUM(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 2 THEN count ELSE 0 END) as tues,
       SUM(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 3 THEN count ELSE 0 END) as wed,
       SUM(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 4 THEN count ELSE 0 END) as thur,
       SUM(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 5 THEN count ELSE 0 END) as fri,
       SUM(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 6 THEN count ELSE 0 END) as sat
FROM tableOfImportantThings
GROUP BY id;

However, if I tried the same thing with averages, then I'd be including all of those zeroes in my calculation of the average, which would clearly lower it dramatically. I suppose I could get a count of distinct dates for each day of the week and divide by that count in a later query, but solutions like that seem overly complicated. I'm certain I'm missing something obvious. Any suggestions?

Solution

According to the SQLite documentation:

The avg() function returns the average value of all non-NULL X within a group. String and BLOB values that do not look like numbers are interpreted as 0. The result of avg() is always a floating point value as long as there is at least one non-NULL input even if all inputs are integers. The result of avg() is NULL if and only if there are no non-NULL inputs.
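
To make the quoted behaviour concrete, here is a minimal sketch using Python's built-in sqlite3 module against an in-memory database. The table `t` and column `x` are made up for illustration and are not part of the question's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x)")  # hypothetical one-column table
conn.executemany("INSERT INTO t VALUES (?)", [(10,), (None,), (20,)])

# NULL inputs are skipped: (10 + 20) / 2, returned as a float.
avg_with_null = conn.execute("SELECT avg(x) FROM t").fetchone()[0]

# With no non-NULL inputs at all, avg() yields NULL (None in Python).
avg_all_null = conn.execute("SELECT avg(x) FROM t WHERE x IS NULL").fetchone()[0]

# Turning the NULL into 0 drags the average down: (10 + 0 + 20) / 3.
avg_with_zero = conn.execute("SELECT avg(coalesce(x, 0)) FROM t").fetchone()[0]

print(avg_with_null, avg_all_null, avg_with_zero)
```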

Therefore, if you change the ELSE branch to return NULL instead of 0, AVG() skips the rows that belong to other weekdays and averages only the matching ones:

SELECT id,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 0 THEN count ELSE NULL END) as sun,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 1 THEN count ELSE NULL END) as mon,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 2 THEN count ELSE NULL END) as tues,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 3 THEN count ELSE NULL END) as wed,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 4 THEN count ELSE NULL END) as thur,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 5 THEN count ELSE NULL END) as fri,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 6 THEN count ELSE NULL END) as sat
FROM tableOfImportantThings
GROUP BY id;
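
As a rough check of the difference between ELSE 0 and ELSE NULL, the sketch below runs a two-column version of the query through Python's sqlite3 module. The three sample rows are invented, and the dates are assumed to be stored as ISO-8601 text (YYYY-MM-DD) so that strftime() can read them; 2000-01-02 and 2000-01-09 are Sundays and 2000-01-03 is a Monday.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableOfImportantThings (id INTEGER, date TEXT, count INTEGER)")
conn.executemany(
    "INSERT INTO tableOfImportantThings VALUES (?, ?, ?)",
    [
        (0, "2000-01-02", 10),  # Sunday
        (0, "2000-01-09", 30),  # Sunday
        (0, "2000-01-03", 5),   # Monday
    ],
)

# Same shape as the query above, trimmed to two weekdays; {fill} is the ELSE value.
query = """
SELECT id,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 0 THEN count ELSE {fill} END) AS sun,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 1 THEN count ELSE {fill} END) AS mon
FROM tableOfImportantThings
GROUP BY id
"""

with_null = conn.execute(query.format(fill="NULL")).fetchall()
with_zero = conn.execute(query.format(fill="0")).fetchall()

print(with_null)  # Sunday average over the two Sunday rows only
print(with_zero)  # Sunday average diluted by the Monday row counted as 0
```

With ELSE NULL the Sunday column averages only the two Sunday rows, while ELSE 0 divides the same total by all three rows for that id.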

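Update: Good point, CL. Based on your comment, the query can be simplified. A CASE expression with no ELSE branch evaluates to NULL when no WHEN matches, so the explicit ELSE NULL can be dropped: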

SELECT id,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 0 THEN count END) as sun,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 1 THEN count END) as mon,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 2 THEN count END) as tues,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 3 THEN count END) as wed,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 4 THEN count END) as thur,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 5 THEN count END) as fri,
       AVG(CASE WHEN CAST(strftime('%w', date) AS INTEGER) = 6 THEN count END) as sat
FROM tableOfImportantThings
GROUP BY id;
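
One caveat worth checking before relying on these queries: strftime() only understands the date formats SQLite recognizes, such as YYYY-MM-DD. If the date column really holds text like 01-02-2000 (as the question's sample suggests), strftime('%w', date) returns NULL and every weekday column comes out NULL. The sketch below assumes the stored format is MM-DD-YYYY, which is a guess on my part, and rebuilds an ISO date with substr(); swap the positions if the data is actually DD-MM-YYYY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# ISO-8601 text is understood by SQLite's date functions ('0' means Sunday).
iso_weekday = conn.execute("SELECT strftime('%w', '2000-01-02')").fetchone()[0]

# MM-DD-YYYY text is not recognized, so strftime() returns NULL.
raw_weekday = conn.execute("SELECT strftime('%w', '01-02-2000')").fetchone()[0]

# Assumed fix-up: rebuild YYYY-MM-DD from the MM-DD-YYYY pieces.
converted_weekday = conn.execute(
    """
    SELECT strftime('%w',
                    substr(d, 7, 4) || '-' || substr(d, 1, 2) || '-' || substr(d, 4, 2))
    FROM (SELECT '01-02-2000' AS d)
    """
).fetchone()[0]

print(iso_weekday, raw_weekday, converted_weekday)
```
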
Licensed under: CC-BY-SA with attribution