Is there any way in Postgres to parameterise a procedure for sort order column, when the columns are of different types?
08-03-2021
Question
I am trying to write a function which does (some complex stuff) and returns the results in different orders based on a parameter.
A simplified version would look something like this:
CREATE OR REPLACE FUNCTION test(order_column text)
RETURNS TABLE(thing1 bigint, thing2 text, thing3 timestamp without time zone)
LANGUAGE 'plpgsql'
AS $BODY$
BEGIN
    RETURN QUERY
    SELECT thing1, thing2::text, thing3 FROM some_table
    ORDER BY
        CASE WHEN order_column='id' THEN thing1
             ELSE thing3
        END
    DESC;
END;
$BODY$;
Unfortunately, thing1 is a bigint and thing3 is a timestamp, and when I try to run the function I get an error saying bigint and timestamp types can't be matched, which I interpret as saying that the types returned from the CASE need to be the same (or at least compatible). I can't cast them both to text, because the values then don't sort correctly.
I've tried returning the column numbers instead of the column names. This at least executes, but the result is not sorted by the chosen column (whether in the function or when executed as a simple statement). For example,
SELECT * FROM some_table ORDER BY 1;
works correctly but
SELECT * FROM some_table ORDER BY CASE WHEN TRUE THEN 1 ELSE 2 END;
does not order by column 1.
My work-around would be to do
IF order_column = 'id' THEN
    (masses of complex stuff)
    SELECT ... ORDER BY thing1
ELSE
    (masses of complex stuff, duplicated)
    SELECT ... ORDER BY thing3
END IF;
but that's horrible, and I'm really hoping there's some other way around this, and that I'm currently missing something.
Is there any way to do what I'm trying to do?
Solution
Be careful with conditional ordering: it can produce bad query plans, sometimes forcing a full table scan. If the filtering and joining clauses, or just the size of the actual data, leave only a small number of rows to sort at the end, this is not an issue and something like this will work:
ORDER BY CASE WHEN order_column = 'id' THEN thing1 ELSE NULL END
, CASE WHEN order_column = 'timestamp' THEN thing3 ELSE NULL END
Each CASE returns a single type, so the type mismatch goes away, and the CASE that does not match is NULL for every row and therefore has no effect on the order. To sort descending, as in your example, add DESC after each CASE. In fact this will work regardless of row count; it may just be inefficient for a large amount of data.
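Applied to the function from the question, the pattern might look like the sketch below. It assumes the table and column types given in the question; the alias t is my addition, used so that the column references do not clash with the output column names declared in RETURNS TABLE. The second key uses IS DISTINCT FROM to mirror the original ELSE branch (anything other than 'id', including NULL, sorts by thing3).

```sql
CREATE OR REPLACE FUNCTION test(order_column text)
RETURNS TABLE(thing1 bigint, thing2 text, thing3 timestamp without time zone)
LANGUAGE plpgsql
AS $BODY$
BEGIN
    RETURN QUERY
    SELECT t.thing1, t.thing2::text, t.thing3
    FROM some_table AS t          -- alias avoids ambiguity with the output columns
    ORDER BY
        -- One CASE per sort key, so each expression has a single type.
        -- The key that is not selected is NULL for every row and does not affect the order.
        CASE WHEN order_column = 'id' THEN t.thing1 END DESC,
        CASE WHEN order_column IS DISTINCT FROM 'id' THEN t.thing3 END DESC;
END;
$BODY$;

-- Usage
SELECT * FROM test('id');         -- sorted by thing1, descending
SELECT * FROM test('timestamp');  -- sorted by thing3, descending
```

Because the sort keys are expressions rather than plain columns, the planner is unlikely to use an index on thing1 or thing3 to produce the order, which is the inefficiency mentioned above.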
For larger outputs, your workaround may be more efficient, since each branch's query may be able to make better use of indexes for the sort. Another alternative is to have two procedures, one per sort order, and either call the appropriate one directly or have your main procedure call it based on the sort-order parameter. Depending on how Postgres handles cached query plans for procedures, this may[†] avoid a plan cached for one case being reused for another where it is vastly less efficient.
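The workaround and the two-procedure variant both come down to giving each sort order its own statement. A minimal sketch, assuming the complex part takes no parameters and can be moved into a view so that it is not duplicated; complex_stuff and test_sorted are hypothetical names, and the view body here is only a stand-in for the real query:

```sql
-- Hypothetical view holding the shared "complex stuff" in one place.
CREATE OR REPLACE VIEW complex_stuff AS
SELECT thing1, thing2::text AS thing2, thing3
FROM some_table;

-- Hypothetical function: one plain ORDER BY per branch.
CREATE OR REPLACE FUNCTION test_sorted(order_column text)
RETURNS SETOF complex_stuff
LANGUAGE plpgsql
STABLE
AS $BODY$
BEGIN
    IF order_column = 'id' THEN
        RETURN QUERY SELECT * FROM complex_stuff ORDER BY thing1 DESC;
    ELSE
        RETURN QUERY SELECT * FROM complex_stuff ORDER BY thing3 DESC;
    END IF;
END;
$BODY$;

-- Usage
SELECT * FROM test_sorted('id');
```

As far as I understand it, PL/pgSQL prepares each SQL statement in a function separately, so the two branches should not share a plan, and each ORDER BY is on a plain column that an index could serve. Treat that as an assumption to check with EXPLAIN on your own data rather than a guarantee.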
[†] I'm no expert at all on pg's internals, but single "kitchen sink" procedures and queries with conditional sorts etc. can be a performance killer in SQL Server for this sort of reason.