Why is intarray's push `+` so much faster than array-to-element concatenation `||`?
Domanda
When it comes to using arrays of ints, I always default to using intarray
. But, now I'm wondering why is the default concatenation so much slower.
This answer probably requires some understanding of internals. Internally, intarray
's calls intarray_push_elem
, and ||
calls array_append
Sample Data
CREATE EXTENSION intarray;
CREATE TABLE foo AS
SELECT ARRAY[x::int] AS bar
FROM generate_series(1,1e7)
AS gs(x);
Same result
SELECT ARRAY[1] + 42 = ARRAY[1] || 42;
?column?
----------
t
(1 row)
With ||
EXPLAIN ANALYZE
SELECT bar || 42 FROM foo;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Seq Scan on foo (cost=0.00..198528.55 rows=9999884 width=25) (actual time=0.029..5193.752 rows=10000000 loops=1)
Planning time: 0.051 ms
Execution time: 5679.489 ms
(3 rows)
With +
from intarray
extension
EXPLAIN ANALYZE
SELECT bar + 42 FROM foo;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Seq Scan on foo (cost=0.00..198528.55 rows=9999884 width=25) (actual time=0.026..2066.786 rows=10000000 loops=1)
Planning time: 0.052 ms
Execution time: 2388.462 ms
(3 rows)
Routinely intarray
's +
is faster than ||
. Why is ||
so slow?
Soluzione
It seems as if ||
is doing a lot more for multidimensional arrays.
For instance with ||
catching the following case
SELECT ARRAY[ARRAY[1]] || 5;
ERROR: argument must be empty or one-dimensional array
Whereas, with +
the array is flattened..
SELECT ARRAY[ARRAY[1]] + 5;
?column?
----------
{1,5}
(1 row)
In addition it seems intarray does not handle nulls.
SELECT ARRAY[null]::int[] + 5;
ERROR: array must not contain nulls
It also seems ||
uses PG_RETURN_DATUM
instead of passing by reference which seems to be what +
uses with PG_RETURN_POINTER
.
Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a dba.stackexchange