Return values at more/less specific time points
-
26-12-2020 - |
Question
In PostgreSQL 10.0, I have a big table with two columns:
CREATE TABLE xs (
ts timestamp with time zone NOT NULL,
x integer NOT NULL
)
Now, if I have 100 given time points, how do I return exactly 100 x
-es that are right before (<=
) these time points (for each of those points)?
For just a single one this would be very easy:
SELECT x FROM xs WHERE ts <= '2018-03-13 11:41:47.167973+00' LIMIT 1
But how to do this efficiently for a 100? Something like:
SELECT x FROM xs
WHERE ts FIRST_LESS_THAN_EQUAL_IN
('2018-03-13 11:41:47.167973+00',
'2018-03-13 11:41:47.198564+00',
'2018-03-13 11:41:47.555668+00',
...
'2018-03-13 11:41:57.766888+00')
Obviously, there is an index on ts
.
Solution
I suggest a LATERAL
subquery, best with a LEFT [OUTER] JOIN
to preserve all input rows (displaying a NULL value in case there should be no match at all):
SELECT v.reference_ts, t.most_recent_x
FROM (
VALUES
(timestamptz '2018-03-13 11:41:47.167973+00') -- type cast in first row
, ('2018-03-13 11:41:47.198564+00')
, ('2018-03-13 11:41:47.555668+00')
, ('2018-03-13 11:41:57.766888+00')
) AS v(reference_ts)
LEFT JOIN LATERAL (
SELECT x AS most_recent_x
FROM xs
WHERE ts <= v.reference_ts -- ts is unique (?!); see below
ORDER BY ts DESC
LIMIT 1
) AS t ON true;
A VALUES
expression is one way to put your 100 timestamps into a derived table we can work with. The free-standing VALUES
expression may require an explicit type cast in the first row. See:
There are others ways to pass your values, like an array (or an array literal cast to timestamptz[]
) and unnest()
in the query:
About the LATERAL
subquery:
Don't forget the ORDER BY
or you get arbitrary results - like @a1ex07 commented.
If ts
is not unique, it's undefined which of the duplicates is returned, unless you add more ORDER BY
items to make it deterministic. Like: ORDER BY ts DESC, x DESC
to get the row with the biggest x
from any set of peers. And make the index match.
An index on (ts)
(like you mentioned) works wonders on performance.
Or, ideally, on (ts DESC, x)
if preconditions for index-only scans are met.
dbfiddle here