Extracting sum of time in a period
-
07-10-2020 - |
Question
I have a login table, which registers datahora_ini
(system entry time) and datahora_fim
(system exit time). Sample data:
CREATE TABLE login(cod_user,datahora_ini,datahora_fim)
AS
VALUES
( 101::int, '2017-06-02 08:02:14'::timestamp, '2017-06-02 13:59:23'::timestamp ),
( 102, '2017-06-02 08:03:38', '2017-06-02 16:13:53' ),
( 103, '2017-06-02 08:05:50', '2017-06-02 11:44:39' ),
( 104, '2017-06-02 08:06:03', '2017-06-02 14:01:33' ),
( 105, '2017-06-02 08:06:35', '2017-06-02 14:00:23' ),
( 106, '2017-06-02 08:10:05', '2017-06-02 08:31:26' )
;
I need to set up a query where I can add the time of all the users that were logged into the system at that time! Fake example of the desired format:
time time_logged ( Seconds )
07:00 1005
08:00 4980
09:00 8193
10:00 9259
11:00 25692
12:00 18823
My current query :
SELECT
TO_CHAR(al.datahora_fim,'HH24:00') as Data_ini,
ROUND(SUM(CASE WHEN al.datahora_fim IS NULL THEN EXTRACT(EPOCH FROM (CURRENT_TIMESTAMP - al.datahora_ini)) ELSE EXTRACT(EPOCH FROM (al.datahora_fim - al.datahora_ini)) END))as tempo_logado
FROM callcenter.agente_login al
LEFT JOIN callcenter.agente a ON ( a.id_agente = al.id_agente )
LEFT JOIN crm.usuarios u ON ( a.crm_cod_usuario = u.cod_usuario )
WHERE ( al.datahora_fim BETWEEN '2017-06-02 00:00:00' AND '2017-06-02 23:59:59' )
GROUP BY Data_ini
ORDER BY Data_ini ASC;
Solution 2
SELECT i_start, sum(LEAST(i_end, datahora_fim) - GREATEST(i_start, datahora_ini)) AS sum_times
FROM (
SELECT t AS i_start, t + interval '1 hour' AS i_end
FROM (SELECT generate_series(date_trunc('hour', min(datahora_ini))
, max(datahora_fim), interval '1 hour') AS t
FROM callcenter.agente_login) sub
) t
JOIN callcenter.agente_login ON datahora_ini < i_end
AND datahora_fim > i_start
WHERE datahora_fim > datahora_ini
AND ( datahora_ini BETWEEN '2017-06-20 00:00:00' AND '2017-06-20 23:59:59' )
GROUP BY 1
ORDER BY 1;
OTHER TIPS
Using tsrange
I like tsrange. It's certainly not the only way to do this, but it is not error prone and it's easy to read and write.
SELECT
grp.range,
sum(
EXTRACT(epoch FROM least(upper(grp.range),datahora_fim))
- EXTRACT(epoch FROM greatest(lower(grp.range),datahora_ini))
)
FROM (
SELECT
date_trunc('hour', min(datahora_ini)),
date_trunc('hour', max(datahora_fim))
FROM login
) AS bounds(min,max)
CROSS JOIN LATERAL generate_series(min, max, '1 hour') AS gs(start)
CROSS JOIN LATERAL tsrange(gs.start, gs.start + '1 hour') AS grp(range)
JOIN login ON grp.range && tsrange(datahora_ini,datahora_fim)
GROUP BY range
ORDER BY range;
range | sum
-----------------------------------------------+-------
["2017-06-02 08:00:00","2017-06-02 09:00:00") | 17821
["2017-06-02 09:00:00","2017-06-02 10:00:00") | 18000
["2017-06-02 10:00:00","2017-06-02 11:00:00") | 18000
["2017-06-02 11:00:00","2017-06-02 12:00:00") | 17079
["2017-06-02 12:00:00","2017-06-02 13:00:00") | 14400
["2017-06-02 13:00:00","2017-06-02 14:00:00") | 14363
["2017-06-02 14:00:00","2017-06-02 15:00:00") | 3716
["2017-06-02 15:00:00","2017-06-02 16:00:00") | 3600
["2017-06-02 16:00:00","2017-06-02 17:00:00") | 833
(9 rows)
The first part generates the ranges for the data
SELECT bounds.*, grp.*
FROM (
SELECT
date_trunc('hour', min(datahora_ini)),
date_trunc('hour', max(datahora_fim))
FROM login
) AS bounds(min,max)
CROSS JOIN LATERAL generate_series(min, max, '1 hour') AS gs(start)
CROSS JOIN LATERAL tsrange(gs.start, gs.start + '1 hour') AS grp(range)
ORDER BY range;
min | max | range
---------------------+---------------------+-----------------------------------------------
2017-06-02 08:00:00 | 2017-06-02 17:00:00 | ["2017-06-02 08:00:00","2017-06-02 09:00:00")
2017-06-02 08:00:00 | 2017-06-02 17:00:00 | ["2017-06-02 09:00:00","2017-06-02 10:00:00")
2017-06-02 08:00:00 | 2017-06-02 17:00:00 | ["2017-06-02 10:00:00","2017-06-02 11:00:00")
2017-06-02 08:00:00 | 2017-06-02 17:00:00 | ["2017-06-02 11:00:00","2017-06-02 12:00:00")
2017-06-02 08:00:00 | 2017-06-02 17:00:00 | ["2017-06-02 12:00:00","2017-06-02 13:00:00")
2017-06-02 08:00:00 | 2017-06-02 17:00:00 | ["2017-06-02 13:00:00","2017-06-02 14:00:00")
2017-06-02 08:00:00 | 2017-06-02 17:00:00 | ["2017-06-02 14:00:00","2017-06-02 15:00:00")
2017-06-02 08:00:00 | 2017-06-02 17:00:00 | ["2017-06-02 15:00:00","2017-06-02 16:00:00")
2017-06-02 08:00:00 | 2017-06-02 17:00:00 | ["2017-06-02 16:00:00","2017-06-02 17:00:00")
(9 rows)
The second part joins them back to the original data, and
- Pulls the seconds since epoch on the least of the high-point on the range and the
datahora_fim
. The ceiling for the range is the range itself. - Pulls the seconds since epoch on the greatest of the low point on the range, and
datahora_ini
. The floor for the range is the range itself. - Subtracts the two to get the difference in seconds
- Sums it up.
That looks like this,
sum(
EXTRACT(epoch FROM least(upper(grp.range),datahora_fim))
- EXTRACT(epoch FROM greatest(lower(grp.range),datahora_ini))
)
This method can use a functional index on tsrange(datahora_ini,datahora_fim)
.
Your original query just shows the lower part of the tsrange
, if you prefer that just use lower(grp.range)