Question

I have the following table, values, and type:

create table example  (
fname text,
lname text,
salary int);

insert into example values
('doge','coin',123),
('bit','coin',434),
('lite','coin',565),
('doge','meme',183),
('bit','meme',453),
('lite','meme',433);

create type resultrow as (
nam text,
amount int);

I would like to write a function that groups by a column chosen through a parameter I pass to the function. This example works:

do $$
declare
 my_parameter text;
 results resultrow[];
begin

my_parameter = 'last';

results := array(select row( case when my_parameter = 'first' then fname
            when my_parameter = 'last' then lname
       end,
       sum(salary))::resultrow
  from example
  group by case when my_parameter = 'first' then fname
            when my_parameter = 'last' then lname
       end);

raise notice '%', results;
end;
$$ language plpgsql;

I have been told that CASE WHEN expressions are really expensive. One obvious solution would be to write the select statement twice:

if my_parameter = 'first' then
  results := array(select row(fname,sum(salary))::resultrow
  from example
  group by fname);
end if;

if my_parameter = 'last' then
  results := array(select row(lname,sum(salary))::resultrow
  from example
  group by lname);
end if;

But this leads to a lot of ugly duplicated code.

Is there another solution to make the group by parameterisable?

Solution

If you don't want to use case, you can use this:

with cte(name, salary) as (
    select fname, salary from example where my_parameter = 'first'
    union all
    select lname, salary from example where my_parameter = 'last'
) 
select name, sum(salary)
from cte
group by name

It is better to measure, though; I have not heard that case is expensive.

If you find that case is not expensive, I still suggest using a subquery or CTE to avoid code duplication, like:

with cte(name, salary) as (
    select
        case
            when my_parameter = 'first' then fname
            when my_parameter = 'last' then lname
        end as name,
        salary
    from example
) 
select name, sum(salary)
from cte
group by name
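
Both queries assume that my_parameter is a PL/pgSQL variable or function parameter. A minimal sketch of the first one wrapped in a function, assuming the question's example table has an integer column named salary; the function name sum_by_param and the output column names are made up for illustration:

```sql
-- Hypothetical wrapper (the name sum_by_param is not from the answer).
-- Assumes example(fname text, lname text, salary int) from the question.
CREATE FUNCTION sum_by_param(my_parameter text)
  RETURNS TABLE (nam text, amount bigint) AS   -- sum(int) yields bigint
$func$
BEGIN
   RETURN QUERY
   WITH cte(name, salary) AS (
       -- only the branch whose WHERE condition is true contributes rows
       SELECT fname, salary FROM example WHERE my_parameter = 'first'
       UNION ALL
       SELECT lname, salary FROM example WHERE my_parameter = 'last'
   )
   SELECT name, sum(salary)
   FROM   cte
   GROUP  BY name;
END
$func$ LANGUAGE plpgsql;
```

It would be called as SELECT * FROM sum_by_param('last'); any other parameter value yields no rows, because neither branch of the union matches.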

OTHER TIPS

Simplify what you have:

DO
$do$
DECLARE
   _param  text := 'last'; -- one can assign at declaration time
   results resultrow[];
BEGIN

results := ARRAY(
   SELECT t::resultrow     -- refer to table alias to get whole row
   FROM (
      SELECT CASE _param   -- simple "switched" CASE
               WHEN 'first' THEN fname
               WHEN 'last'  THEN lname
             END
            ,sum(salary)
      FROM   example
      GROUP  BY 1          -- simpler with positional reference
      ) t
   );

RAISE NOTICE '%', results;

END
$do$ LANGUAGE plpgsql;

This uses the simple CASE syntax variant: the operand is written (and evaluated) only once and the syntax is shorter. I mention it since your question refers to CASE, even if that's hardly relevant.
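
For comparison, here are the two CASE forms side by side. This is only an illustrative sketch: a VALUES list stands in for the parameter and a single table row, and the alias names p, searched and simple are made up.

```sql
-- Self-contained: no table needed, the VALUES row supplies p, fname and lname.
SELECT CASE WHEN p = 'first' THEN fname   -- searched CASE: a full condition per branch
            WHEN p = 'last'  THEN lname
       END AS searched,
       CASE p WHEN 'first' THEN fname     -- simple CASE: operand p written once
              WHEN 'last'  THEN lname
       END AS simple
FROM  (VALUES ('last', 'doge', 'coin')) AS v(p, fname, lname);
```

Both columns return coin here. With no ELSE branch, either form yields NULL when p matches none of the listed values.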

It also uses a positional reference in the GROUP BY clause, which seems relevant to the title of your question. More explanation in these related answers:
Select first row in each GROUP BY group?
GROUP BY + CASE statement

This kind of query can be very inefficient. The problem is not the (very cheap!) CASE expression per se. It is that the planner has to provide for varying input in the first column and may be forced to use a generic, less optimized plan.

Dynamic SQL

I assume the actual goal is to write a function that takes my_parameter. Use dynamic SQL with EXECUTE, which will likely result in a superior query plan, i.e. superior performance. There are lots of code examples here; try a search.

Also, I return a set of resultrow instead of the awkward ARRAY you had in your example (which you needed there, since a DO statement cannot return anything):

CREATE FUNCTION f_salaray_for_param(_param text)
  RETURNS SETOF resultrow AS
$func$
DECLARE
   _fld    text :=
      CASE _param
        WHEN 'first' THEN 'fname'  -- SQL injection not possible
        WHEN 'last'  THEN 'lname'
      END;
BEGIN

IF _fld IS NULL THEN               -- exception for invalid params
   RAISE EXCEPTION 'Unexpected value for _param: %', _param;
END IF;

RETURN QUERY EXECUTE '
   SELECT ' || _fld || ', sum(salary)::int
   FROM   example
   GROUP  BY 1';                   -- query is very simple now

END
$func$ LANGUAGE plpgsql;

Call:

SELECT * FROM f_salaray_for_param('first');
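
As a variation, the query string can be built with format() and the %I specifier, which quotes the column name as an identifier. This is a sketch rather than part of the original answer: the function name f_salary_by_column is hypothetical, the table is assumed to be example(fname, lname, salary int), and RETURNS TABLE is used in place of resultrow so the block does not depend on that type.

```sql
-- Hypothetical variant (function name is illustrative, not from the answer).
-- Assumes example(fname text, lname text, salary int) from the question.
CREATE FUNCTION f_salary_by_column(_param text)
  RETURNS TABLE (nam text, amount int) AS
$func$
DECLARE
   _fld text :=
      CASE _param
        WHEN 'first' THEN 'fname'   -- whitelist: only known column names pass
        WHEN 'last'  THEN 'lname'
      END;
BEGIN
   IF _fld IS NULL THEN
      RAISE EXCEPTION 'Unexpected value for _param: %', _param;
   END IF;

   -- %I quotes _fld as an identifier; sum(int) is bigint, so cast to
   -- match the declared int output column.
   RETURN QUERY EXECUTE format(
      'SELECT %I, sum(salary)::int FROM example GROUP BY 1', _fld);
END
$func$ LANGUAGE plpgsql;
```

It would be called as SELECT * FROM f_salary_by_column('first');. The whitelist already rules out injection, so %I is a second line of defence that matters mostly if the whitelist is later loosened.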

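BTW, the standard plpgsql assignment operator is :=. Plain = is also accepted, but := is the PL/SQL-compatible spelling and cannot be mistaken for a comparison.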

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow