Question

Using Postgres 9.3.2, I want to count req_status values grouped by req_time and customer_id, and get one row per req_status for every customer_id on every day, even when the count is zero.

req_time     req_id   customer_id     req_status
----------------------------------------------- 
2014-03-19    100        1            'FAILED'
2014-03-19    102        1            'FAILED'
2014-03-19    105        1            'OK'
2014-03-19    106        2            'FAILED'
2014-03-20    107        1            'OK'
2014-03-20    108        2            'FAILED'
2014-03-20    109        2            'OK'
2014-03-20    110        1            'OK'

Desired output

req_time  customer_id   req_status  count
-------------------------------------------
2014-03-19    1            'FAILED'   2
2014-03-19    1            'OK'       1
2014-03-19    2            'FAILED'   1
2014-03-19    2            'OK'       0
2014-03-20    1            'FAILED'   0
2014-03-20    1            'OK'       2
2014-03-20    2            'FAILED'   1
2014-03-20    2            'OK'       1

How can I achieve this?

Solution

To also see missing rows in the result, left join to a complete grid of possible rows. The grid is built from all possible combinations of (req_time, customer_id, req_status) with cross joins:

SELECT d.req_time, c.customer_id, s.req_status, count(t.req_time) AS ct
FROM  (
   -- one row per day between the first and the last req_time
   SELECT generate_series (min(req_time), max(req_time), '1 day')::date
   FROM   tbl
   ) d(req_time)
CROSS  JOIN (SELECT DISTINCT customer_id FROM tbl)  c(customer_id)  -- every customer
CROSS  JOIN (VALUES ('FAILED'::text), ('OK'))       s(req_status)   -- every status
LEFT   JOIN  tbl t USING (req_time, customer_id, req_status)        -- matching rows, if any
GROUP  BY 1,2,3
ORDER  BY 1,2,3;

count(t.req_time) counts a column from the actual table, so the result is 0 when no match is found: the left join fills unmatched grid rows with NULL, and count() ignores NULL values.
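
The difference between count(*) and a count on a column of the joined table can be seen in a small sketch. It runs on SQLite through Python's sqlite3 module instead of Postgres, only so that it is self-contained; the tables grid and tbl and their contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grid (customer_id INTEGER);                   -- hypothetical "complete grid"
    CREATE TABLE tbl  (customer_id INTEGER, req_status TEXT);  -- hypothetical data table
    INSERT INTO grid VALUES (1), (2);
    INSERT INTO tbl  VALUES (1, 'OK');                         -- customer 2 has no rows
""")

rows = conn.execute("""
    SELECT g.customer_id,
           count(*)             AS count_star,  -- counts the joined row itself
           count(t.customer_id) AS count_col    -- ignores the NULL of an unmatched row
    FROM   grid AS g
    LEFT   JOIN tbl AS t ON t.customer_id = g.customer_id
    GROUP  BY g.customer_id
    ORDER  BY g.customer_id
""").fetchall()

for row in rows:
    print(row)
```

For customer 2, count(*) reports 1 because the left join still produces one (NULL-padded) row, while the count on the table column reports 0.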

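This assumes req_time is of type date (not timestamp). A timestamp with a time-of-day part would not match the generated days in the join unless it is cast to date first.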
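
As a rough check of the grid approach against the sample data from the question, here is a sketch that runs on SQLite through Python's sqlite3 module. It is a port, not the Postgres query itself: SQLite has no built-in generate_series() by default, so a recursive CTE stands in for it, dates are stored as ISO text, and the VALUES list is written as a UNION ALL. The table name tbl follows the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (req_time TEXT, req_id INTEGER, customer_id INTEGER, req_status TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        ("2014-03-19", 100, 1, "FAILED"),
        ("2014-03-19", 102, 1, "FAILED"),
        ("2014-03-19", 105, 1, "OK"),
        ("2014-03-19", 106, 2, "FAILED"),
        ("2014-03-20", 107, 1, "OK"),
        ("2014-03-20", 108, 2, "FAILED"),
        ("2014-03-20", 109, 2, "OK"),
        ("2014-03-20", 110, 1, "OK"),
    ],
)

QUERY = """
WITH RECURSIVE d(req_time) AS (      -- assumption: stand-in for generate_series()
    SELECT min(req_time) FROM tbl
    UNION ALL
    SELECT date(req_time, '+1 day') FROM d
    WHERE  req_time < (SELECT max(req_time) FROM tbl)
)
SELECT d.req_time, c.customer_id, s.req_status, count(t.req_time) AS ct
FROM   d
CROSS  JOIN (SELECT DISTINCT customer_id FROM tbl) AS c
CROSS  JOIN (SELECT 'FAILED' AS req_status UNION ALL SELECT 'OK') AS s
LEFT   JOIN tbl AS t
       ON  t.req_time    = d.req_time
       AND t.customer_id = c.customer_id
       AND t.req_status  = s.req_status
GROUP  BY d.req_time, c.customer_id, s.req_status
ORDER  BY d.req_time, c.customer_id, s.req_status
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The eight rows printed correspond to the desired output in the question, including the two zero counts for ('2014-03-19', 2, 'OK') and ('2014-03-20', 1, 'FAILED').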

Similar answer here:
array_agg group by and null

OTHER TIPS

select
    s.req_time, s.customer_id,
    s.req_status,
    count(t.req_status) as "count"
from
    t
    right join (
        (
            select distinct customer_id, req_time
            from t
        ) q
        cross join
        (values ('FAILED'), ('OK')) s(req_status)
    ) s on
        t.req_status = s.req_status and
        t.customer_id = s.customer_id and
        t.req_time = s.req_time
group by 1, 2, 3
order by 1, 2, 3
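
One difference from a grid built by cross joining days and customers is worth noting: this query takes its (customer_id, req_time) pairs from rows that exist in the table, so a customer with no requests at all on a given day gets no rows for that day. With the sample data from the question, every customer has requests on both days, so both approaches agree. The sketch below uses made-up data where that is not the case; it runs on SQLite through Python's sqlite3 module only to stay self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (req_time TEXT, customer_id INTEGER, req_status TEXT);
    -- Hypothetical data: customer 2 has no requests on 2014-03-20.
    INSERT INTO t VALUES
        ('2014-03-19', 1, 'OK'),
        ('2014-03-19', 2, 'FAILED'),
        ('2014-03-20', 1, 'OK');
""")

# Grid from pairs that actually occur in the table.
existing_pairs = conn.execute("""
    SELECT DISTINCT req_time, customer_id FROM t ORDER BY 1, 2
""").fetchall()

# Grid from every day combined with every customer
# (days are taken from the table here, not from a generated series).
all_pairs = conn.execute("""
    SELECT d.req_time, c.customer_id
    FROM  (SELECT DISTINCT req_time FROM t) AS d
    CROSS JOIN (SELECT DISTINCT customer_id FROM t) AS c
    ORDER BY 1, 2
""").fetchall()

# Pairs that only the cross-join grid contains.
missing = sorted(set(all_pairs) - set(existing_pairs))
print(missing)
```

Which behavior is wanted depends on whether a day without any request from a customer should still be reported with zero counts.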
Licensed under: CC-BY-SA with attribution