Create cyclic pointers in postgresql 9.5
13-03-2021
Question
The problem is to write the "ADD-VALUE" request defined below.
For each "cat" value in the table below, keep only 3 records. Consider this table:
id | cat | value | updatedAt |
---|---|---|---|
1 | cat1 | v1 | 06/01/2021 00:00:01 |
2 | cat1 | v2 | 06/01/2021 00:00:02 |
3 | cat1 | v3 | 06/01/2021 00:00:03 (pointer cat1 is here) |
4 | cat2 | v1 | 06/01/2021 00:01:01 |
5 | cat2 | v2 | 06/01/2021 00:01:02 (pointer cat2 is here) |
INSERT case: Calling "ADD-VALUE(cat=cat2, value=v3)" will produce the result in bold:
id | cat | value | updatedAt |
---|---|---|---|
1 | cat1 | v1 | 06/01/2021 00:00:01 |
2 | cat1 | v2 | 06/01/2021 00:00:02 |
3 | cat1 | v3 | 06/01/2021 00:00:03 (pointer cat1 is here) |
4 | cat2 | v1 | 06/01/2021 00:01:01 |
5 | cat2 | v2 | 06/01/2021 00:01:02 |
**6** | **cat2** | **v3** | **06/01/2021 00:01:02 (pointer cat2 is now here)** |
UPDATE case: Calling "ADD-VALUE(cat=cat1, value=v4)" will produce the result in bold:
id | cat | value | updatedAt |
---|---|---|---|
**1** | **cat1** | **v4** | **07/01/2021 00:00:04 (pointer cat1 is now here)** |
2 | cat1 | v2 | 06/01/2021 00:00:02 |
3 | cat1 | v3 | 06/01/2021 00:00:03 |
4 | cat2 | v1 | 06/01/2021 00:01:01 |
5 | cat2 | v2 | 06/01/2021 00:01:02 (pointer cat2 is here) |
Any advice is welcome. Is it perhaps impossible to do the UPDATE or INSERT in a single request? I am thinking about using row_number() to get the count per category.
Solution
Your idea would be a pain to enforce under concurrent load. Instead, just keep adding new rows (INSERT only). There are simple and fast queries to get the current row(s) for each cat.
Make sure that updated_at is current. I added a column default and a trigger for that.
To rule out duplicate entries for the same cat and the same timestamp, add a UNIQUE constraint.
```sql
CREATE TABLE cat (
  id         serial PRIMARY KEY
, cat        text NOT NULL
, value      text
, updated_at timestamptz NOT NULL DEFAULT now()
, CONSTRAINT uni_cat_updated_at UNIQUE (cat, updated_at)
);

-- trigger func & trigger
CREATE FUNCTION public.trg_cat_updated()
  RETURNS trigger
  LANGUAGE plpgsql AS
$func$
BEGIN
   NEW.updated_at = now();
   RETURN NEW;
END
$func$;

CREATE TRIGGER cat_befupd
BEFORE UPDATE ON public.cat
FOR EACH ROW EXECUTE PROCEDURE public.trg_cat_updated();
```
Or, to be absolutely sure, make that:
```sql
BEFORE INSERT OR UPDATE ON public.cat
```
The default value normally takes care of inserts cheaply, but it can be overridden by explicitly inserting a value. The trigger overrides any supplied value no matter what.
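With this design, the "ADD-VALUE" request from the question reduces to a plain INSERT. A minimal sketch, assuming the table definition above and using the values from the question's example:

```sql
-- ADD-VALUE(cat=cat2, value=v3): append a row instead of overwriting one.
-- updated_at is filled in by the column default
-- (or by the trigger, if it is defined BEFORE INSERT OR UPDATE).
INSERT INTO cat (cat, value)
VALUES ('cat2', 'v3');
```

Keep in mind that now() returns the start time of the current transaction, so two such inserts for the same cat inside one transaction would collide on the UNIQUE constraint.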
To get the "current" row for a given cat:
```sql
SELECT value
FROM   cat
WHERE  cat = 'cat1'
ORDER  BY updated_at DESC
LIMIT  1;
```
This is extremely fast as long as you have a matching index:
```sql
CREATE INDEX ON cat (cat, updated_at DESC, value);
```
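The answer mentions queries for the current row(s) of each cat but only shows the single-cat case. To fetch the current row of every cat in one statement, one option (not part of the original answer) is DISTINCT ON, a Postgres extension to standard SQL. A sketch against the same table:

```sql
-- One row per cat: the one with the latest updated_at.
-- The leading ORDER BY expression must match the DISTINCT ON expression.
SELECT DISTINCT ON (cat)
       cat, value, updated_at
FROM   cat
ORDER  BY cat, updated_at DESC;
```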
In modern Postgres versions (11 or later) you could use a unique covering index to replace both the index and the UNIQUE constraint.
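As a rough illustration of such a covering index: the INCLUDE clause was added in PostgreSQL 11, so this does not apply to 9.5. The index name below is hypothetical, and the sketch assumes the table is created without the uni_cat_updated_at constraint:

```sql
-- Requires PostgreSQL 11 or later.
-- Enforces uniqueness on (cat, updated_at) and carries "value" as a
-- non-key payload column, so the "current row" query can be answered
-- from the index alone.
CREATE UNIQUE INDEX cat_cat_updated_at_uni   -- hypothetical name
ON cat (cat, updated_at DESC) INCLUDE (value);
```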
To get the last three (live) rows, use the same query with LIMIT 3.
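Spelled out, that could look as follows. The second statement is an addition, not part of the original answer: it returns the three most recent rows for every cat at once, using the same window-function pattern as the cleanup query at the end of this answer.

```sql
-- Last three (live) rows for a single cat
SELECT value, updated_at
FROM   cat
WHERE  cat = 'cat1'
ORDER  BY updated_at DESC
LIMIT  3;

-- Last three (live) rows for every cat
SELECT cat, value, updated_at
FROM  (
   SELECT cat, value, updated_at
        , row_number() OVER (PARTITION BY cat ORDER BY updated_at DESC) AS rn
   FROM   cat
   ) sub
WHERE  rn <= 3
ORDER  BY cat, updated_at DESC;
```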
From time to time (as your db load and schedule permit/require) delete deprecated rows:
```sql
DELETE FROM cat c
USING (
   SELECT id, row_number() OVER (PARTITION BY cat ORDER BY updated_at DESC) AS rn
   FROM   cat
   ) d
WHERE  d.rn > 3
AND    c.id = d.id;
```