How to get the first (or any single) value in GROUP BY without ARRAY_AGG?

https://stackoverflow.com/questions/21411872

03-10-2022

Question

I'm migrating some SQL from PostgreSQL 9.2 to Vertica 7.0, and I could use some help replacing postgres's cool array_agg feature with something that Vertica (and possibly other RDBMS) supports, such as partitions and window functions. I'm new to these features, and I'd really appreciate your ideas.

The (working) query using array_agg (SQL Fiddle demo):

SELECT B.id, (array_agg(A.X))[1]
FROM B, AB, A
WHERE B.id = AB.B_id AND A.id = AB.A_id AND A.X IS NOT NULL
GROUP BY B.id;

If I naively select A.X by itself without the aggregation (i.e., to let the RDBMS pick, which actually works with MySQL and SQLite), PostgreSQL complains. Running the same query but with "A.X" instead of "(array_agg(A.X))[1]":

ERROR:  column "a.x" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT B.id, A.X

I was thinking of trying a window function, e.g., something like the query from this question:

SELECT email, FIRST_VALUE(email) OVER (PARTITION BY email)
FROM questions
GROUP BY email;

but I get the same error:

SELECT B.id, FIRST_VALUE(A.X) OVER (PARTITION BY A.id)
FROM B, AB, A
WHERE B.id = AB.B_id AND A.id = AB.A_id AND A.X IS NOT NULL
GROUP BY B.id;

ERROR:  column "a.x" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT B.id AS id, FIRST_VALUE(A.X) OVER (PARTITION BY A.id)...

Note that we don't care so much about getting the first value, we just need any (ideally deterministic) single value.

Thank you in advance.

Solution

@a_horse_with_no_name's comment, along with that of Denis, was what we needed to rethink our approach. We have switched to MIN(). Thanks!
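
For readers landing here, a minimal sketch of what the MIN() version of the query might look like. The table and column names come from the question; the sample rows, column types, and the explicit JOIN syntax are assumptions added for illustration, not the asker's actual schema or final query.

```sql
-- Hypothetical sample data; only the table and column names come from the question.
CREATE TABLE A  (id INT, X VARCHAR(20));
CREATE TABLE B  (id INT);
CREATE TABLE AB (A_id INT, B_id INT);

INSERT INTO A VALUES (1, 'apple');
INSERT INTO A VALUES (2, 'banana');
INSERT INTO A VALUES (3, NULL);
INSERT INTO A VALUES (4, 'cherry');
INSERT INTO B VALUES (1);
INSERT INTO B VALUES (2);
INSERT INTO AB VALUES (1, 1);
INSERT INTO AB VALUES (2, 1);
INSERT INTO AB VALUES (3, 2);
INSERT INTO AB VALUES (4, 2);

-- MIN() is an ordinary aggregate, so A.X no longer has to appear in GROUP BY.
-- The IS NOT NULL filter is kept from the original query so that groups whose
-- X values are all NULL are still excluded from the result.
SELECT B.id, MIN(A.X) AS X
FROM B
JOIN AB ON AB.B_id = B.id
JOIN A  ON A.id = AB.A_id
WHERE A.X IS NOT NULL
GROUP BY B.id
ORDER BY B.id;

-- Result with the sample rows above:
--   1, apple
--   2, cherry
```

Note that MIN() returns the smallest value in each group rather than the "first" one, which is fine here since any single deterministic value will do. MAX() would work the same way.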

Licensed under: CC-BY-SA with attribution
