Extracting sum of time in a period

https://dba.stackexchange.com/questions/176068

07-10-2020

Question

I have a login table, which registers datahora_ini (system entry time) and datahora_fim (system exit time). Sample data:

CREATE TABLE login(cod_user,datahora_ini,datahora_fim)
AS
  VALUES
    ( 101::int, '2017-06-02 08:02:14'::timestamp, '2017-06-02 13:59:23'::timestamp ),
    ( 102,      '2017-06-02 08:03:38',            '2017-06-02 16:13:53' ),
    ( 103,      '2017-06-02 08:05:50',            '2017-06-02 11:44:39' ),
    ( 104,      '2017-06-02 08:06:03',            '2017-06-02 14:01:33' ),
    ( 105,      '2017-06-02 08:06:35',            '2017-06-02 14:00:23' ),
    ( 106,      '2017-06-02 08:10:05',            '2017-06-02 08:31:26' )
  ;

I need a query that, for each hour of the day, sums the time that all users spent logged into the system during that hour. A made-up example of the desired format:

time    time_logged (seconds)
07:00   1005
08:00   4980
09:00   8193
10:00   9259
11:00   25692
12:00   18823

My current query:

SELECT 
 TO_CHAR(al.datahora_fim,'HH24:00') as Data_ini, 
 ROUND(SUM(CASE WHEN al.datahora_fim IS NULL THEN EXTRACT(EPOCH FROM (CURRENT_TIMESTAMP - al.datahora_ini)) ELSE EXTRACT(EPOCH FROM (al.datahora_fim - al.datahora_ini)) END))as tempo_logado  

FROM callcenter.agente_login al  
LEFT JOIN callcenter.agente a ON ( a.id_agente = al.id_agente )  
LEFT JOIN crm.usuarios u ON ( a.crm_cod_usuario = u.cod_usuario )  
WHERE ( al.datahora_fim BETWEEN '2017-06-02 00:00:00' AND '2017-06-02 23:59:59' )  

GROUP BY Data_ini  
ORDER BY Data_ini ASC;

Solution 2

SELECT i_start, sum(LEAST(i_end, datahora_fim) - GREATEST(i_start, datahora_ini)) AS sum_times
FROM  (
   SELECT t AS i_start, t + interval '1 hour' AS i_end
   FROM  (SELECT generate_series(date_trunc('hour', min(datahora_ini))
                                                  , max(datahora_fim), interval '1 hour') AS t
          FROM callcenter.agente_login) sub
   ) t
JOIN   callcenter.agente_login ON datahora_ini < i_end
          AND datahora_fim   > i_start
WHERE  datahora_fim > datahora_ini
AND ( datahora_ini BETWEEN '2017-06-20 00:00:00' AND '2017-06-20 23:59:59' ) 
GROUP  BY 1
ORDER  BY 1;

OTHER TIPS

Using `tsrange`

I like tsrange. It's certainly not the only way to do this, but it is not error-prone, and it's easy to read and write.

SELECT
  grp.range,
  sum(
    EXTRACT(epoch FROM least(upper(grp.range),datahora_fim))
    - EXTRACT(epoch FROM greatest(lower(grp.range),datahora_ini))
  )
FROM (
  SELECT
    date_trunc('hour', min(datahora_ini)),
    date_trunc('hour', max(datahora_fim))
  FROM login
) AS bounds(min,max)
CROSS JOIN LATERAL generate_series(min, max, '1 hour') AS gs(start)
CROSS JOIN LATERAL tsrange(gs.start, gs.start + '1 hour') AS grp(range)
JOIN login ON grp.range && tsrange(datahora_ini,datahora_fim)
GROUP BY range
ORDER BY range;
                     range                     |  sum  
-----------------------------------------------+-------
 ["2017-06-02 08:00:00","2017-06-02 09:00:00") | 17821
 ["2017-06-02 09:00:00","2017-06-02 10:00:00") | 18000
 ["2017-06-02 10:00:00","2017-06-02 11:00:00") | 18000
 ["2017-06-02 11:00:00","2017-06-02 12:00:00") | 17079
 ["2017-06-02 12:00:00","2017-06-02 13:00:00") | 14400
 ["2017-06-02 13:00:00","2017-06-02 14:00:00") | 14363
 ["2017-06-02 14:00:00","2017-06-02 15:00:00") |  3716
 ["2017-06-02 15:00:00","2017-06-02 16:00:00") |  3600
 ["2017-06-02 16:00:00","2017-06-02 17:00:00") |   833
(9 rows)

The first part generates the hourly ranges that cover the data:

SELECT bounds.*, grp.*
FROM (
  SELECT
    date_trunc('hour', min(datahora_ini)),
    date_trunc('hour', max(datahora_fim))
  FROM login
) AS bounds(min,max)
CROSS JOIN LATERAL generate_series(min, max, '1 hour') AS gs(start)
CROSS JOIN LATERAL tsrange(gs.start, gs.start + '1 hour') AS grp(range)
ORDER BY range;
         min         |         max         |                     range                     
---------------------+---------------------+-----------------------------------------------
 2017-06-02 08:00:00 | 2017-06-02 16:00:00 | ["2017-06-02 08:00:00","2017-06-02 09:00:00")
 2017-06-02 08:00:00 | 2017-06-02 16:00:00 | ["2017-06-02 09:00:00","2017-06-02 10:00:00")
 2017-06-02 08:00:00 | 2017-06-02 16:00:00 | ["2017-06-02 10:00:00","2017-06-02 11:00:00")
 2017-06-02 08:00:00 | 2017-06-02 16:00:00 | ["2017-06-02 11:00:00","2017-06-02 12:00:00")
 2017-06-02 08:00:00 | 2017-06-02 16:00:00 | ["2017-06-02 12:00:00","2017-06-02 13:00:00")
 2017-06-02 08:00:00 | 2017-06-02 16:00:00 | ["2017-06-02 13:00:00","2017-06-02 14:00:00")
 2017-06-02 08:00:00 | 2017-06-02 16:00:00 | ["2017-06-02 14:00:00","2017-06-02 15:00:00")
 2017-06-02 08:00:00 | 2017-06-02 16:00:00 | ["2017-06-02 15:00:00","2017-06-02 16:00:00")
 2017-06-02 08:00:00 | 2017-06-02 16:00:00 | ["2017-06-02 16:00:00","2017-06-02 17:00:00")
(9 rows)

The second part joins the ranges back to the original data wherever they overlap, and then, for each matching pair:

1. Takes the seconds since epoch of the lesser of the range's upper bound and datahora_fim, so a session that runs past the end of the hour is capped at the end of the hour.
2. Takes the seconds since epoch of the greater of the range's lower bound and datahora_ini, so a session that began before the hour is counted from the start of the hour.
3. Subtracts the second value from the first to get the overlap in seconds.
4. Sums the overlaps for each range.

That looks like this:

sum(
  EXTRACT(epoch FROM least(upper(grp.range),datahora_fim))
  - EXTRACT(epoch FROM greatest(lower(grp.range),datahora_ini))
)
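To make the clamping concrete, here is a small hand-checkable sketch for single rows against single hours. The timestamps are copied from the sample data (users 106 and 101); the column alias seconds is only illustrative.

```sql
-- User 106 (08:10:05 to 08:31:26) against the 08:00 range:
-- the session lies entirely inside the hour, so neither bound is clamped.
SELECT EXTRACT(epoch FROM least('2017-06-02 09:00:00'::timestamp,
                                '2017-06-02 08:31:26'::timestamp))
     - EXTRACT(epoch FROM greatest('2017-06-02 08:00:00'::timestamp,
                                   '2017-06-02 08:10:05'::timestamp)) AS seconds;
-- 21 min 21 s = 1281 seconds

-- User 101 (08:02:14 to 13:59:23) against the 13:00 range:
-- the start is clamped up to 13:00:00, the end stays at 13:59:23.
SELECT EXTRACT(epoch FROM least('2017-06-02 14:00:00'::timestamp,
                                '2017-06-02 13:59:23'::timestamp))
     - EXTRACT(epoch FROM greatest('2017-06-02 13:00:00'::timestamp,
                                   '2017-06-02 08:02:14'::timestamp)) AS seconds;
-- 59 min 23 s = 3563 seconds
```

Depending on the PostgreSQL version, the result may be displayed with trailing decimal zeros, because the return type of EXTRACT differs between releases.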

This method can use an expression (functional) index on tsrange(datahora_ini,datahora_fim), for example a GiST index, which supports the && overlap operator used in the join.
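A minimal sketch of such an index, assuming the sample table login; the index name login_session_range_idx is made up for illustration.

```sql
-- Hypothetical index name. GiST supports the && (overlaps) operator on range types.
-- The indexed expression must match the one used in the join condition.
CREATE INDEX login_session_range_idx
  ON login
  USING gist (tsrange(datahora_ini, datahora_fim));
```

Whether the planner actually uses the index depends on table size and statistics; on a table as small as the sample, a sequential scan is likely to be chosen anyway.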

Your original query shows only the start of each hour. If you prefer that, select lower(grp.range) instead of grp.range.
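As a sketch of that variation, the query below labels each row with the start of the hour, formatted with the same 'HH24:00' pattern as the original query, and rounds the sum to whole seconds. The aliases "time" and time_logged are taken from the desired output in the question; everything else is the query from above.

```sql
SELECT
  to_char(lower(grp.range), 'HH24:00') AS "time",  -- start of the hour as a label
  round(sum(
    EXTRACT(epoch FROM least(upper(grp.range), datahora_fim))
    - EXTRACT(epoch FROM greatest(lower(grp.range), datahora_ini))
  )) AS time_logged                                 -- seconds logged in that hour
FROM (
  SELECT
    date_trunc('hour', min(datahora_ini)),
    date_trunc('hour', max(datahora_fim))
  FROM login
) AS bounds(min, max)
CROSS JOIN LATERAL generate_series(min, max, '1 hour') AS gs(start)
CROSS JOIN LATERAL tsrange(gs.start, gs.start + '1 hour') AS grp(range)
JOIN login ON grp.range && tsrange(datahora_ini, datahora_fim)
GROUP BY grp.range   -- still grouped by the full range, not by the label
ORDER BY grp.range;
```

The sums are the same as in the output above; only the label changes. Because the grouping is still on the full range, data spanning several days would produce one row per hour per day, with labels that repeat across days.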

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange
