Question

(PostgreSQL 9.3) I have a table "events" with millions of complex events, stored as received by a device. For example purposes:

+-----------+-------+
| Timestamp | Event |
+-----------+-------+
| 1         | A     |
| 2         | A     |
| 2         | B     |
| 3         | B     |
| 10        | A     |
| 11        | A     |
| 11        | 0     |
| 11        | C     |
| 12        | A     |
+-----------+-------+

In this case I have four different kinds of events: A, B, C and 0. What I want to do is index them such that I have start/stop timestamps for each event. The stop conditions are: the event is no longer being reported at a given timestamp, OR a "0" event came in, clearing all of them. Final output:

+------+----+-------+
| From | To | Event |
+------+----+-------+
| 1    | 3  | A     |
| 2    | 10 | B     |
| 10   | 11 | A     |
| 11   | 11 | C     |
| 12   |    | A     |
+------+----+-------+

In this case, A was raised at 1 and cleared at 3 because it was no longer being reported at that moment. B was raised at 2 and cleared at 10 for a similar reason. A was raised again at 10 and cleared at 11 by the 0 event (despite being reported at that time too!). C was raised at 11 AND cleared at the same time (some ordering will be needed to handle a 0 at the same timestamp). Lastly, A was raised again at 12 and is currently active, so it gets a NULL end timestamp.

I do have something that works, but it is CTE-heavy and as such doesn't scale well to millions of records. I have been experimenting with LATERAL (with great results) and I am open to any 9.3-specific recommendations. Also, the "event" itself has been greatly simplified for this question; in reality it is a complex group of columns. It's possible window functions could apply here too.


Solution

Thinking outside the box here: why not maintain the summary table with a trigger?

Here is an example for your case (foreign keys etc. omitted):

create table event_type (
    event_type_id serial,
    event_name varchar(255)
);

create table event (
    event_time timestamp(0),
    event_type_id int
);

create table event_summary (
    event_summary_id serial,
    sum_from timestamp(0),
    sum_to timestamp(0),
    event_type_id int
);

-- no "create language plpgsql" needed: plpgsql is installed by default since PostgreSQL 9.0

create or replace function event_insertion() returns trigger as $$
    declare
        var_event_summary_id integer;
    begin
        -- find out if event was fired during the previous second
        select
            event_summary_id
        into
            var_event_summary_id
        from
            event_summary s
        where
            new.event_type_id = s.event_type_id
            and sum_to >= new.event_time - interval '1 seconds';

        if found then
            --update existing summary to include this timestamp
            update event_summary set sum_to = new.event_time where event_summary_id = var_event_summary_id;
        else
            --create new summary for just this timestamp
            insert into event_summary(sum_from,sum_to,event_type_id) values (new.event_time,new.event_time,new.event_type_id);
        end if;

    return null;
    end;
$$ language plpgsql;

create trigger event_insertion after insert on event
    for each row execute procedure event_insertion();
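
-- Assumption, not part of the original answer: the trigger looks up
-- event_summary by event_type_id and sum_to on every insert, so with
-- millions of rows an index along these lines may help.
-- The index name is hypothetical.
create index event_summary_type_to_idx on event_summary (event_type_id, sum_to);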

-- some initial data
insert into event_type(event_name) values ('a');
insert into event_type(event_name) values ('b');
insert into event_type(event_name) values ('c');
insert into event_type(event_name) values ('0');

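-- fire the events
-- (run each statement in its own transaction, e.g. psql with autocommit:
-- now() returns the transaction start time, so inside a single transaction
-- every row would get the same timestamp)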
insert into event(event_time,event_type_id) values (now(),(select event_type_id from event_type where event_name = 'a'));
select pg_sleep(1);
insert into event(event_time,event_type_id) values (now(),(select event_type_id from event_type where event_name = 'a'));
insert into event(event_time,event_type_id) values (now(),(select event_type_id from event_type where event_name = 'b'));
select pg_sleep(1);
insert into event(event_time,event_type_id) values (now(),(select event_type_id from event_type where event_name = 'b'));
select pg_sleep(7);
insert into event(event_time,event_type_id) values (now(),(select event_type_id from event_type where event_name = 'a'));
select pg_sleep(1);
insert into event(event_time,event_type_id) values (now(),(select event_type_id from event_type where event_name = 'a'));
insert into event(event_time,event_type_id) values (now(),(select event_type_id from event_type where event_name = '0'));
insert into event(event_time,event_type_id) values (now(),(select event_type_id from event_type where event_name = 'c'));
select pg_sleep(1);
insert into event(event_time,event_type_id) values (now(),(select event_type_id from event_type where event_name = 'a'));

-- query the summary table
select
    extract(seconds from s.sum_from),
    extract(seconds from s.sum_to),
    t.event_name
from
    event_summary s
    inner join event_type t on (t.event_type_id = s.event_type_id);
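
For comparison with the trigger approach, the window-function route mentioned in the question can be sketched as a gaps-and-islands query. This is only a sketch under assumptions that are not in the original post: a table events(ts, event) with integer timestamps, at most one row per (ts, event), and '0' as the clear-all marker. A run of an event ends at the timestamp of a '0' that arrives while the event is being reported, otherwise at the next timestamp present in the table (NULL if there is none). Note that these are the question's semantics, not those of the summary table above, whose sum_to holds the last time the event was seen.

```sql
-- Hypothetical table and column names, loaded with the question's sample data.
CREATE TEMP TABLE events (ts int, event text);
INSERT INTO events (ts, event) VALUES
    (1, 'A'), (2, 'A'), (2, 'B'), (3, 'B'), (10, 'A'),
    (11, 'A'), (11, '0'), (11, 'C'), (12, 'A');

SELECT i.ts_from,
       CASE WHEN i.cleared_by_zero
            THEN i.ts_last                       -- a '0' arrived at the last reported timestamp
            ELSE (SELECT min(n.ts)               -- next timestamp present in the table, if any
                  FROM events n
                  WHERE n.ts > i.ts_last)
       END AS ts_to,
       i.event
FROM (
    SELECT event,
           min(ts)            AS ts_from,
           max(ts)            AS ts_last,
           bool_or(zero_here) AS cleared_by_zero
    FROM (
        SELECT ts, event,
               -- constant while an event shows up at consecutive distinct timestamps
               dense_rank() OVER (ORDER BY ts)
                 - row_number() OVER (PARTITION BY event ORDER BY ts) AS run_id,
               -- number of '0' events strictly before this timestamp
               sum((event = '0')::int) OVER (ORDER BY ts)
                 - sum((event = '0')::int) OVER (PARTITION BY ts) AS zeros_before,
               -- true if a '0' was reported at this same timestamp
               bool_or(event = '0') OVER (PARTITION BY ts) AS zero_here
        FROM events
    ) r
    WHERE event <> '0'
    GROUP BY event, run_id, zeros_before
) i
ORDER BY i.ts_from, i.event;
```

On the sample data this should give the five rows of the expected output in the question. It uses plain subqueries rather than CTEs because CTEs act as optimization fences in 9.3. An index on events(ts) would probably help the scalar subquery, but that is untested at the scale the question describes.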
Licensed under: CC-BY-SA with attribution