Question

I am trying to write a PL/pgSQL function that runs a query whose WHERE clause is variable, using EXECUTE, like this:

CREATE FUNCTION test1(p_filter TEXT) RETURNS SETOF Bin AS $$
    DECLARE
        q TEXT;
    BEGIN
        q = FORMAT('SELECT * FROM Bin WHERE %s;', p_filter);
        RETURN QUERY EXECUTE q;
    END
$$ LANGUAGE plpgsql;

The function ran much slower than the query run directly, so I tried creating a function that simply returns the query, like this:

CREATE FUNCTION test2() RETURNS SETOF Bin AS $$
    BEGIN
        RETURN QUERY SELECT * FROM Bin WHERE ((bin.binFrac).linedivisionnb = 0 AND (bin.binFrac).columndivisionnb = 0 AND (bin.binFrac).height = 64 AND bin._weight = 0);
    END
$$ LANGUAGE plpgsql;

In both cases, using auto_explain, I got this plan:

2021-01-12 11:18:14.524 CET [19356] postgres@astar LOG:  duration: 26.203 ms  plan:
    Query Text: SELECT * FROM Bin WHERE ((bin.binFrac).linedivisionnb = 0 AND (bin.binFrac).columndivisionnb = 0 AND (bin.binFrac).height = 64 AND bin._weight = 0);
    Seq Scan on bin  (cost=0.00..3779.06 rows=1 width=181) (actual time=0.008..22.763 rows=72687 loops=1)
      Filter: (((binfrac).linedivisionnb = 0) AND ((binfrac).columndivisionnb = 0) AND (_weight = 0) AND ((binfrac).height = 64))
      Rows Removed by Filter: 14316
2021-01-12 11:18:14.524 CET [19356] postgres@astar CONTEXT:  PL/pgSQL function test1(text) line 6 at RETURN QUERY
2021-01-12 11:18:14.532 CET [19356] postgres@astar LOG:  duration: 45.242 ms  plan:
    Query Text: EXPLAIN ANALYZE SELECT * FROM test1('((bin.binFrac).linedivisionnb = 0 AND (bin.binFrac).columndivisionnb = 0 AND (bin.binFrac).height = 64 AND bin._weight = 0)');
    Function Scan on test1  (cost=0.25..10.25 rows=1000 width=511) (actual time=36.904..43.560 rows=72687 loops=1)

When running the query directly or in an SQL STABLE function, I only get the first part:

2021-01-12 11:25:30.976 CET [19356] postgres@astar LOG:  duration: 27.241 ms  plan:
    Query Text: EXPLAIN ANALYZE SELECT * FROM Bin WHERE ((bin.binFrac).linedivisionnb = 0 AND (bin.binFrac).columndivisionnb = 0 AND (bin.binFrac).height = 64 AND bin._weight = 0);
    Seq Scan on bin  (cost=0.00..3779.06 rows=1 width=181) (actual time=0.009..25.511 rows=72687 loops=1)
      Filter: (((binfrac).linedivisionnb = 0) AND ((binfrac).columndivisionnb = 0) AND (_weight = 0) AND ((binfrac).height = 64))
      Rows Removed by Filter: 14316

Where does the extra 20 ms come from? Is there a way to remove this overhead, or at least mitigate it?


Solution

The extra time goes to the overhead of PL/pgSQL, which is not small.

With RETURN QUERY, PL/pgSQL first reads all of the tuples into a tuple store, and then reads them back out again. In this case, I think that accounts for most of the additional time. Note that this extra overhead will be proportional to how many rows are returned (not how many are inspected but then filtered out).
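
As a point of comparison, here is a minimal sketch of the SQL STABLE variant mentioned in the question, which there produced only the plain Seq Scan plan. The name test3 is hypothetical; the table Bin and the filter are copied from the question. It assumes the planner can inline a LANGUAGE sql function that consists of a single SELECT, is marked STABLE, and is called in FROM. Whether that happens on a given server version should be checked with EXPLAIN.

```sql
-- Hypothetical function name (test3); table and filter come from the question.
-- Assumption: a single-SELECT SQL function marked STABLE (not STRICT, not
-- SECURITY DEFINER) may be inlined by the planner, which would avoid the
-- tuple store that RETURN QUERY fills in PL/pgSQL.
CREATE FUNCTION test3() RETURNS SETOF Bin AS $$
    SELECT * FROM Bin
    WHERE ((bin.binFrac).linedivisionnb = 0
       AND (bin.binFrac).columndivisionnb = 0
       AND (bin.binFrac).height = 64
       AND bin._weight = 0)
$$ LANGUAGE sql STABLE;

-- If the function was inlined, the plan should show a Seq Scan on bin
-- rather than a Function Scan on test3.
EXPLAIN ANALYZE SELECT * FROM test3();
```

This only covers a fixed filter. A WHERE clause passed in as text, as in test1, still needs EXECUTE in PL/pgSQL, so the sketch shows where the overhead goes rather than replacing that function.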

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange