Question

CREATE TABLE public.temp (
   id integer, 
   a integer, 
   b integer, 
   x integer
);

select x, a, b
    from temp
group by id, a, b;

fails with the expected error:

ERROR:  column "temp.x" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select x, a, b
               ^

However, after adding a primary key:

ALTER TABLE ONLY temp ADD PRIMARY KEY (id);

... the previous SELECT query runs without error.

Why can column x now be used without aggregation?
Is the behaviour documented?


Solution

Since Postgres 9.1, a PRIMARY KEY listed in the GROUP BY clause covers every column of its table. The release notes:

Allow non-GROUP BY columns in the query target list when the primary key is specified in the GROUP BY clause (Peter Eisentraut)

The manual:

When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.
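The rule quoted above can be tried out with a small script. This is only a sketch: the table name fd_demo and the sample rows are made up for illustration and are not part of the question.

```sql
-- Hypothetical demo table; same columns as in the question,
-- but with the primary key declared up front.
CREATE TEMP TABLE fd_demo (
    id integer PRIMARY KEY,
    a  integer,
    b  integer,
    x  integer
);

INSERT INTO fd_demo VALUES (1, 10, 20, 100), (2, 10, 20, 200);

-- Accepted: the grouped columns include the primary key (id),
-- so x is functionally dependent on them.
SELECT x, a, b
FROM   fd_demo
GROUP  BY id, a, b;

-- Rejected with the "must appear in the GROUP BY clause" error:
-- a and b do not include the primary key, so x is ambiguous per group.
-- (Left commented out so the script runs to the end.)
-- SELECT x, a, b FROM fd_demo GROUP BY a, b;
```

Uncommenting the last statement should reproduce the error from the question, because both sample rows fall into the same (a, b) group while carrying different values of x.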


OTHER TIPS

GROUP BY on the primary key column makes each row its own group. A column that is neither grouped nor aggregated then has exactly one value within each group, so that value can be returned without ambiguity.

If no primary key is part of the grouping expression, the server must assume that a group may contain more than one value, and it raises an error because of the possible ambiguity.

More generally, the fact that no error is raised in this case seems to be an extension.

If the PK is part of the grouping expression, then formally all the other grouping columns are redundant. That is, if id is the PK, then group by id, group by id,a,b,x, group by id,a,b, group by id,b,x, group by id,a,x, ... all have the same meaning and effect.
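A quick way to check this equivalence is to run several of the variants against the same data. The sketch below assumes a made-up table pk_groups with three sample rows; the names and values are illustrative only.

```sql
-- Hypothetical table with id as the primary key.
CREATE TEMP TABLE pk_groups (
    id integer PRIMARY KEY,
    a  integer,
    b  integer,
    x  integer
);

INSERT INTO pk_groups VALUES
    (1, 10, 20, 100),
    (2, 10, 20, 200),
    (3, 30, 40, 300);

-- Each query groups by a column list that contains the primary key,
-- so every group holds exactly one row and count(*) is 1 throughout.
SELECT id, a, b, x, count(*) AS n FROM pk_groups GROUP BY id          ORDER BY id;
SELECT id, a, b, x, count(*) AS n FROM pk_groups GROUP BY id, a, b    ORDER BY id;
SELECT id, a, b, x, count(*) AS n FROM pk_groups GROUP BY id, a, b, x ORDER BY id;
```

All three statements return the same three rows, one per id, which is what makes the extra grouping columns redundant.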

Still, the best practice is to use an aggregate function for each non-GROUP BY column or expression in any case.
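As a sketch of that advice, the query below groups by a and b only and wraps x in aggregates, so it does not rely on any primary key. The table agg_demo and its rows are hypothetical, and which aggregate is appropriate (min, max, sum, ...) depends on what the query is meant to report.

```sql
-- Hypothetical table; no primary key is needed for this query to be valid.
CREATE TEMP TABLE agg_demo (
    id integer,
    a  integer,
    b  integer,
    x  integer
);

INSERT INTO agg_demo VALUES (1, 10, 20, 100), (2, 10, 20, 200);

-- x is only referenced inside aggregate functions,
-- so the result is well defined for every (a, b) group.
SELECT a, b,
       min(x)   AS min_x,
       max(x)   AS max_x,
       count(*) AS n
FROM   agg_demo
GROUP  BY a, b;
```

With the two sample rows this yields a single group (a = 10, b = 20) with min_x = 100, max_x = 200 and n = 2.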

Licensed under: CC-BY-SA with attribution