Question

The problem is to write the "ADD-VALUE" request defined below.

For each "cat" value in the table below, keep at most 3 records. Consider this table:

id  cat   value  updatedAt
1   cat1  v1     06/01/2021 00:00:01
2   cat1  v2     06/01/2021 00:00:02
3   cat1  v3     06/01/2021 00:00:03  (pointer cat1 is here)
4   cat2  v1     06/01/2021 00:01:01
5   cat2  v2     06/01/2021 00:01:02  (pointer cat2 is here)

INSERT case: cat2 holds fewer than 3 rows, so calling "ADD-VALUE(cat=cat2, value=v3)" inserts a new row (id 6):

id  cat   value  updatedAt
1   cat1  v1     06/01/2021 00:00:01
2   cat1  v2     06/01/2021 00:00:02
3   cat1  v3     06/01/2021 00:00:03  (pointer cat1 is here)
4   cat2  v1     06/01/2021 00:01:01
5   cat2  v2     06/01/2021 00:01:02
6   cat2  v3     06/01/2021 00:01:02  (pointer cat2 is now here)

UPDATE case: cat1 already holds 3 rows, so calling "ADD-VALUE(cat=cat1, value=v4)" overwrites the oldest row (id 1):

id  cat   value  updatedAt
1   cat1  v4     07/01/2021 00:00:04  (pointer cat1 is now here)
2   cat1  v2     06/01/2021 00:00:02
3   cat1  v3     06/01/2021 00:00:03
4   cat2  v1     06/01/2021 00:01:01
5   cat2  v2     06/01/2021 00:01:02  (pointer cat2 is here)

Any advice is welcome. Is it perhaps impossible to do the UPDATE or INSERT in a single request? I am thinking of using a row number to get the count per category.


Solution

Your idea would be a pain to enforce under concurrent load. Instead, just keep adding new rows (INSERT only). There are simple and fast queries to get the current row(s) for each cat.

Make sure that updated_at is current. I added a column default and a trigger for that. To rule out duplicate entries for the same cat and the same timestamp, add a UNIQUE constraint.

CREATE TABLE cat(
  id         serial PRIMARY KEY
, cat        text NOT NULL
, value      text
, updated_at timestamptz NOT NULL DEFAULT now()
, CONSTRAINT uni_cat_updated_at UNIQUE (cat, updated_at)
);
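With this table in place, the ADD-VALUE request from the question reduces to a plain INSERT. A minimal sketch, using the example values from the question and leaving updated_at to the column default:

```sql
-- ADD-VALUE(cat=cat2, value=v3): always INSERT, never UPDATE.
-- id comes from the serial column, updated_at from DEFAULT now().
INSERT INTO cat (cat, value)
VALUES ('cat2', 'v3')
RETURNING id, cat, value, updated_at;
```

Note that now() returns the start time of the current transaction, so two inserts for the same cat inside one transaction would collide on the UNIQUE constraint. If that can happen in your workload, clock_timestamp() may be a better fit; that substitution is an assumption about your requirements, not part of the original setup.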

-- trigger func & trigger
CREATE FUNCTION public.trg_cat_updated()
  RETURNS trigger
  LANGUAGE plpgsql AS
$func$
BEGIN
   NEW.updated_at = now();
   RETURN NEW;
END
$func$;

CREATE TRIGGER cat_befupd
BEFORE UPDATE ON public.cat
FOR EACH ROW EXECUTE PROCEDURE public.trg_cat_updated();

Or, to be absolutely sure, make that:

BEFORE INSERT OR UPDATE ON public.cat

The default value normally takes care of inserts cheaply, but it can be overridden by explicitly inserting a value. The trigger overrides any supplied value, no matter what.
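Spelled out, the stricter variant might look like the following sketch. The trigger name cat_befins_upd is made up for illustration; the function is the one defined earlier. The final INSERT tries to backdate a row to show the trigger replacing the supplied value.

```sql
-- Replace the UPDATE-only trigger with one that also fires on INSERT.
DROP TRIGGER IF EXISTS cat_befupd ON public.cat;

CREATE TRIGGER cat_befins_upd   -- hypothetical name
BEFORE INSERT OR UPDATE ON public.cat
FOR EACH ROW EXECUTE PROCEDURE public.trg_cat_updated();

-- The explicit timestamp is ignored: the trigger sets updated_at = now().
INSERT INTO cat (cat, value, updated_at)
VALUES ('cat1', 'v4', '2000-01-01 00:00:00+00')
RETURNING updated_at;
```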

To get the "current" row for a given cat:

SELECT value
FROM   cat
WHERE  cat = 'cat1'
ORDER  BY updated_at DESC
LIMIT  1;

This is extremely fast as long as you have a matching index:

CREATE INDEX ON cat (cat, updated_at DESC, value);

In Postgres 11 or later, you could use a unique covering index (with an INCLUDE clause) to replace both the index and the UNIQUE constraint.
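A sketch of that variant, assuming Postgres 11 or later (where INCLUDE is available). The new index name is made up for illustration, and the name in DROP INDEX is only the one Postgres would typically auto-generate for the plain index, so check the actual name in your schema first.

```sql
-- Remove the two objects being replaced.
ALTER TABLE cat DROP CONSTRAINT uni_cat_updated_at;
DROP INDEX IF EXISTS cat_cat_updated_at_value_idx;   -- assumed auto-generated name

-- One unique index covers both jobs: it enforces uniqueness of
-- (cat, updated_at) and carries value along for index-only scans.
CREATE UNIQUE INDEX cat_cat_updated_at_uni           -- hypothetical name
ON cat (cat, updated_at DESC) INCLUDE (value);
```

The trade-off is that uniqueness is then enforced by an index rather than a named table constraint, which is functionally equivalent for inserts.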

To get the last three (live) rows, use the same query with LIMIT 3.
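For reference, the first statement below is that query for a single cat. The second is a sketch of one way to fetch the live rows for every cat at once, using a row_number() window function partitioned by cat; the alias names are arbitrary.

```sql
-- Last three (live) rows for one cat, newest first.
SELECT value, updated_at
FROM   cat
WHERE  cat = 'cat1'
ORDER  BY updated_at DESC
LIMIT  3;

-- Live rows for all cats at once.
SELECT cat, value, updated_at
FROM  (
   SELECT cat, value, updated_at
        , row_number() OVER (PARTITION BY cat ORDER BY updated_at DESC) AS rn
   FROM   cat
   ) sub
WHERE  rn <= 3
ORDER  BY cat, updated_at DESC;
```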

From time to time (as your db load and schedule permit or require), delete obsolete rows, meaning everything beyond the latest three per cat:

DELETE FROM cat c
USING (
   SELECT id, row_number() OVER (PARTITION BY cat ORDER BY updated_at DESC) AS rn
   FROM cat
   ) d
WHERE  d.rn > 3
AND    c.id = d.id;
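Before running the DELETE, it can help to preview which rows it would remove. A sketch that runs the same ranking subquery as a plain SELECT:

```sql
-- Rows ranked 4th or older within their cat are the ones the DELETE targets.
SELECT id, cat, value, updated_at
FROM  (
   SELECT *, row_number() OVER (PARTITION BY cat ORDER BY updated_at DESC) AS rn
   FROM   cat
   ) d
WHERE  d.rn > 3
ORDER  BY cat, updated_at;
```

If this returns no rows, every cat currently holds three rows or fewer and the cleanup has nothing to do.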


Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange