Question

Lately I have written some functions that trigger the "out of shared memory" warning in PostgreSQL. I have tried increasing max_locks_per_transaction, but the warning still occurs.

As far as I understand from the answer to "PostgreSQL complaining about shared memory, but shared memory seems to be OK", creating and dropping temp tables takes locks, which can lead to the warning mentioned above.

For example:

create or replace function remove_vertices(par_geom geometry) 
returns geometry as $$ 
DECLARE 
cnt int;
BEGIN

drop table if exists invalid_pnts;
create temp table invalid_pnts as 
select * from selfintersects(par_geom) ; 
-- calls other function. selfintersects(par_geom) returns X rows with 4 columns/attributes, 
-- where X corresponds to number of faulty vertices in a postgis geometry
-- can return 0 rows   

select count(*) into cnt from invalid_pnts;
if cnt > 0 then
    -- do something with invalid_pnts table and the input "par_geom"
else
    return par_geom;
end if;

end $$
language plpgsql;

Since this function is supposed to be called per row of a table (i.e. select remove_vertices(geom) from some_table), the dropping and creation of the temp table can occur as many times as there are rows in a table.

What is an alternative to dropping/creating temp tables, if you need a "table variable" in a function?

Solution

Wouldn't a simple loop be enough?

create or replace function remove_vertices(par_geom geometry)
returns geometry as $$
declare
  invalid_pnts record;
begin
  for invalid_pnts in select * from selfintersects(par_geom)
  loop
    -- do something with every record
  end loop;

  -- FOUND is false here if the loop did not iterate at all
  if not found then
    -- loop was empty
    return par_geom;
  end if;

  return par_geom;  -- placeholder: return the geometry modified in the loop
end;
$$ language plpgsql;

Frequent creation of temporary tables also bloats the system catalogs, which slows down every query. Creating and dropping temporary tables at a high rate (say, a few tens per second) is a really bad idea in PostgreSQL.

OTHER TIPS

This is really hard to answer, because you don't show the big picture. I would be inclined to do everything in a single statement that joins to the result of the function call, rather than storing the result somewhere, so that no intermediate storage is needed at all. But without seeing more of the code (the "calls other function..." part), I can't offer any alternatives.

One way to work around the temp table could also be to store the result into arrays. Something like:

select array_agg(c1), array_agg(c2), array_agg(c3), array_agg(c4)
  into c1_result, c2_result, c3_result, c4_result
from selfintersects(par_geom);

Then iterate over the array contents. Of course, this is only feasible if the function doesn't return millions of rows. Hundreds or a few thousand should be OK though.
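A minimal sketch of that iteration, written as a standalone DO block so it can be tried without PostGIS. Here generate_series is only a stand-in for selfintersects(par_geom), and the int[] variables are an assumption, since the real column types are not shown in the question:

```sql
do $$
declare
  c1_result int[];
  c2_result int[];
begin
  -- generate_series stands in for selfintersects(par_geom) here.
  select array_agg(g), array_agg(g * 10)
    into c1_result, c2_result
  from generate_series(1, 3) as g;

  -- array_length returns NULL for an empty or NULL array, so
  -- coalesce makes the loop run zero times in that case.
  for i in 1 .. coalesce(array_length(c1_result, 1), 0)
  loop
    raise notice 'row %: c1 = %, c2 = %', i, c1_result[i], c2_result[i];
  end loop;
end;
$$;
```

The coalesce matters because array_agg yields NULL rather than an empty array when the function returns no rows.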

Alternatively, you could create a type that represents the function's result and aggregate the result into a single array of that type (rather than having four arrays).
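Something along these lines might work for the single-array variant. The type name invalid_pnt and the columns c1 to c4 (typed int here) are hypothetical, because the question does not show what selfintersects() actually returns:

```sql
-- Hypothetical row type; the real column names and types of
-- selfintersects() are not shown in the question.
create type invalid_pnt as (c1 int, c2 int, c3 int, c4 int);

create or replace function remove_vertices(par_geom geometry)
returns geometry as $$
declare
  pnts invalid_pnt[];
  pnt  invalid_pnt;
begin
  -- Collect all rows into one array instead of a temp table.
  select array_agg(row(s.c1, s.c2, s.c3, s.c4)::invalid_pnt)
    into pnts
  from selfintersects(par_geom) as s;

  -- array_agg yields NULL when there are no input rows.
  if pnts is null then
    return par_geom;
  end if;

  foreach pnt in array pnts
  loop
    -- do something with pnt.c1 ... pnt.c4 and par_geom
    null;
  end loop;

  return par_geom;  -- placeholder: return the modified geometry
end;
$$ language plpgsql;
```

FOREACH ... IN ARRAY needs PostgreSQL 9.1 or later. No temp table is created, so no catalog entries or extra locks are involved per call.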


I would probably try to rewrite the function so that it can work on multiple rows, and then join it in the outer query rather than calling it in the SELECT list as you seem to be doing. Then maybe you don't need any intermediate storage at all. But without more details I can't suggest a concrete way to do it.
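As a rough illustration of joining to the function result instead of calling it per row in the SELECT list, a lateral join (PostgreSQL 9.3 or later) could look like the sketch below. some_table and geom are taken from the question; the id column and the output columns c1 to c4 are hypothetical:

```sql
-- some_table and geom come from the question; id and the column
-- names c1..c4 are hypothetical.
select t.id, s.c1, s.c2, s.c3, s.c4
from some_table as t
cross join lateral selfintersects(t.geom) as s;
```

Rows whose geometry has no faulty vertices drop out of this result; writing left join lateral selfintersects(t.geom) as s on true instead would keep them, with NULLs in the s columns. Whether the actual repair step can be expressed on top of such a join depends on code the question does not show.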

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange