Question

I have two tables in PostgreSQL 12: sensor (receives a time-series data feed from sensors) and labels (contains sensor labels and metadata). I am trying to create a PostgreSQL trigger that updates each row of sensor data as it is inserted into sensor, filling in the corresponding label name from labels. I am unable to get the row update to work. A sample is shown below:

Sample Dataset

CREATE TABLE sensor (
   datavalue integer, 
   created timestamp,
   label_id integer,
   label_name text);

CREATE TABLE labels (
   id integer,
   label_id integer,
   label_name text);

Insert into labels values (1, 215, 'Home'), (2, 216, 'Shop'), (3, 217, 'Flat'), (4, 218, 'Street');
Insert into sensor values (67, '2020-09-02 08:40:07.354', 215),(67, '2020-09-02 08:41:07.354', 215),(67, '2020-09-02 08:41:07.354', 216);

Trigger Function

CREATE OR REPLACE FUNCTION update_name()
RETURNS trigger AS 
$func$
BEGIN
UPDATE sensor 
    SET label_name = b.label_name 
    from labels b
    where new.label_id = b.label_id;
     RETURN NEW; 
END
$func$  LANGUAGE plpgsql;

CREATE TRIGGER name_update_trigger
AFTER INSERT OR UPDATE ON sensor
FOR EACH ROW EXECUTE PROCEDURE update_name();

Test row insertion

Insert into sensor values (78, '2020-09-02 08:40:07.354', 215),(77, '2020-09-02 08:41:07.354', 215),(67, '2020-09-02 08:41:07.354', 216);

I have a very large number of rows flowing in, and would like to update each row as it is inserted.

Any assistance greatly appreciated. Thanks!


Solution

Most likely you're going in the wrong direction, though I only have a gut feeling about that.

You can still use triggers for this, and it may do the job. However, your trigger performs a massive update on every call: the UPDATE has no condition restricting which rows of sensor it touches, so every row in sensor is updated with the new label, which seems wrong.

As I'm not sure about your specific needs, I will just point you in what I hope is the right direction with these changes:


CREATE OR REPLACE FUNCTION update_name()
RETURNS trigger AS
$$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO sensor(datavalue, created, label_id, label_name)
        SELECT NEW.datavalue, NEW.created, NEW.label_id, b.label_name
        from labels b
        where new.label_id = b.label_id;

    ELSIF TG_OP = 'UPDATE' THEN
        UPDATE sensor
        SET label_name = b.label_name
        from labels b
        where new.label_id = b.label_id;

    END IF;
    RETURN NULL;

END
$$  LANGUAGE plpgsql;

CREATE TRIGGER name_update_trigger
BEFORE INSERT OR UPDATE ON sensor
FOR EACH ROW
WHEN (pg_trigger_depth() < 1)
EXECUTE PROCEDURE update_name();

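Here, the changes are made instead of (before) the original insert or update: the trigger function runs BEFORE the row operation and returns NULL, so the original operation is skipped and the statement inside the function takes its place. The insert will be correct for rows whose label_id exists in labels; a row with no matching label is not inserted at all, because the SELECT returns no rows. The update needs to be adjusted or removed, since from your description it seems you will only be making inserts here.

Also note the limiter, WHEN (pg_trigger_depth() < 1), which keeps the INSERT inside the function from firing the trigger again, as proposed here: https://dba.stackexchange.com/a/103661/213360.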
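
For the insert-only case described in the question, a simpler variant may be to assign the label to NEW directly in a BEFORE trigger, rather than issuing a second INSERT from inside the function. The following is a minimal sketch under a few assumptions: the function name set_label_name is made up for illustration, LIMIT 1 is there because the sample labels table has no unique constraint on label_id, and the trigger reuses the name name_update_trigger from the question.

```sql
-- Sketch: fill label_name on the incoming row itself, before it is stored.
-- set_label_name is a hypothetical name, not part of the original post.
CREATE OR REPLACE FUNCTION set_label_name()
RETURNS trigger AS
$$
BEGIN
    -- Look up the label for the row being written. If no label matches,
    -- the subquery yields NULL and label_name is set to NULL.
    NEW.label_name := (SELECT b.label_name
                       FROM labels b
                       WHERE b.label_id = NEW.label_id
                       LIMIT 1);  -- assumption: label_id may not be unique
    RETURN NEW;  -- returning NEW lets the original INSERT or UPDATE proceed
END
$$ LANGUAGE plpgsql;

-- Replace any earlier trigger of the same name on sensor.
DROP TRIGGER IF EXISTS name_update_trigger ON sensor;

CREATE TRIGGER name_update_trigger
BEFORE INSERT OR UPDATE ON sensor
FOR EACH ROW
EXECUTE PROCEDURE set_label_name();
```

Because the function only modifies NEW and never writes to sensor itself, it touches one row per call and needs no pg_trigger_depth() guard. With the sample data, a row inserted with label_id 215 should be stored with label_name 'Home'.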

OTHER TIPS

I have to agree with Adam Tokarski's answer that the usage pattern looks a bit like it's going in an awkward direction. For simplicity, would it not make sense to treat the label as if it were any other measurement that may be updated? i.e.:

CREATE TABLE sensor (
  sensor_id integer,
  label text,
  datavalue integer,
  created timestamp);

INSERT INTO sensor VALUES 
  (215, 'Home', 61, '2020-09-02 08:40:07.354'),
  (216, 'Shop', 61, '2020-09-02 08:41:07.354'),
  (217, 'Flat', 61, '2020-09-02 08:41:07.354');

This way, you're only doing one insert, not worrying about updating, and the integer for sensor ID is constant and is your identifier.

In general, if you're struggling with performance, I suggest using a time series database for this. I write docs for QuestDB, which is designed for this kind of work. A bonus optimisation, if the labels are enum-like, is the symbol type, which is faster to query than strings because the values are stored as ints internally. You just declare the label column as symbol instead of string.
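
As a rough sketch of what that could look like in QuestDB's SQL dialect (not PostgreSQL), the table above might be declared with the label as a SYMBOL column. The designated timestamp and the daily partitioning are assumptions made for illustration, not something stated above:

```sql
-- QuestDB sketch: label stored as SYMBOL rather than a string type.
-- TIMESTAMP(created) and PARTITION BY DAY are illustrative assumptions.
CREATE TABLE sensor (
  sensor_id INT,
  label SYMBOL,
  datavalue INT,
  created TIMESTAMP
) TIMESTAMP(created) PARTITION BY DAY;
```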

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange