How to Choose Between VALUES and SELECT for INSERT?
-
28-09-2020 - |
Domanda
This answer raised the question for me how to choose between VALUES
and SELECT
in such a function . Using PostgreSQL 9.4.3 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit:
CREATE OR REPLACE FUNCTION insaft_function()
RETURNS TRIGGER AS
$func$
BEGIN
INSERT INTO file_headers (measurement_id, file_header_index_start
, file_header_index_end)
VALUES (NEW.measurement_id, TG_ARGV[0]::int, TG_ARGV[1]::int);
RETURN NULL; -- result ignored since this is an AFTER trigger
END
$func$ LANGUAGE plpgsql;
VALUES
work with many rows but with SELECT
you can do much more.
The only requirement here is to do the above INSERT
to the table.
You can assume that there are 100k of such INSERT
s done per cycle in continuous quality assurance of a system.
I noticed these differences with my data where selected three median values here:
VALUES
real user sys
-------------------------------
0m0.353s 0m0.256s 0m0.028s
0m0.327s 0m0.252s 0m0.036s
0m0.358s 0m0.252s 0m0.040s
so average real 0.34s
SELECT
real user sys
-------------------------------
0m0.362s 0m0.256s 0m0.024s
0m0.383s 0m0.236s 0m0.056s
0m0.356s 0m0.264s 0m0.032s
so average real 0.36s
So this small subset of data says that VALUES
is faster with such a simple INSERT
. I am interested in requirements for concurrent processes and real time data analysis.
How can you decide between SELECT
and VALUES
for INSERT
?
Soluzione
The measured difference is almost certainly noise. Run some more iterations, you won't get consistent result. The difference in performance (if any exists) won't be measurable.
You can use either method here. Both are equally good for the purpose. There are often multiple ways in SQL. And sometimes there is no clear winner.
The more important issue concerning performance here:
100k of such INSERTs done per cycle
For big bulk-inserts it would be faster to INSERT
in both tables instead of firing a trigger for every row.
If you are using a serial PK that's generated automatically, you can employ the RETURNING
clause in a data-modifying CTE
WITH ins1 AS (
INSERT INTO measurement (measurement)
VALUES ... -- OR SELECT ... if data comes from inside the DB :)
RETURNING measurement_id -- generating a serial ID?
)
INSERT INTO file_headers (measurement_id, file_header_index_start, file_header_index_end)
SELECT measurement_id, 1, 666 -- here it *must* be SELECT
FROM ins1;
A numerical constant like 666
(no quotes, just digits) defaults to type integer automatically.
May not be applicable, depending on your workflow.
Related: