How to concatenate strings of a string field in a PostgreSQL 'group by' query?

https://stackoverflow.com/questions/43870

09-06-2019
|

Question

I am looking for a way to concatenate the strings of a field within a group by query. So for example, I have a table:

ID   COMPANY_ID   EMPLOYEE
1    1            Anna
2    1            Bill
3    2            Carol
4    2            Dave

and I wanted to group by company_id to get something like:

COMPANY_ID   EMPLOYEE
1            Anna, Bill
2            Carol, Dave

There is a built-in function in mySQL to do this group_concat

Solution

PostgreSQL 9.0 or later:

Recent versions of Postgres (since late 2010) have the string_agg(expression, delimiter) function which will do exactly what the question asked for, even letting you specify the delimiter string:

SELECT company_id, string_agg(employee, ', ')
FROM mytable
GROUP BY company_id;

Postgres 9.0 also added the ability to specify an ORDER BY clause in any aggregate expression; otherwise, the order is undefined. So you can now write:

SELECT company_id, string_agg(employee, ', ' ORDER BY employee)
FROM mytable
GROUP BY company_id;

Or indeed:

SELECT string_agg(actor_name, ', ' ORDER BY first_appearance)

PostgreSQL 8.4 or later:

PostgreSQL 8.4 (in 2009) introduced the aggregate function array_agg(expression) which concatenates the values into an array. Then array_to_string() can be used to give the desired result:

SELECT company_id, array_to_string(array_agg(employee), ', ')
FROM mytable
GROUP BY company_id;

`string_agg` for pre-9.0 versions:

In case anyone comes across this looking for a compatibilty shim for pre-9.0 databases, it is possible to implement everything in string_agg except the ORDER BY clause.

So with the below definition this should work the same as in a 9.x Postgres DB:

SELECT string_agg(name, '; ') AS semi_colon_separated_names FROM things;

But this will be a syntax error:

SELECT string_agg(name, '; ' ORDER BY name) AS semi_colon_separated_names FROM things;
--> ERROR: syntax error at or near "ORDER"

Tested on PostgreSQL 8.3.

CREATE FUNCTION string_agg_transfn(text, text, text)
    RETURNS text AS 
    $$
        BEGIN
            IF $1 IS NULL THEN
                RETURN $2;
            ELSE
                RETURN $1 || $3 || $2;
            END IF;
        END;
    $$
    LANGUAGE plpgsql IMMUTABLE
COST 1;

CREATE AGGREGATE string_agg(text, text) (
    SFUNC=string_agg_transfn,
    STYPE=text
);

Custom variations (all Postgres versions)

Prior to 9.0, there was no built-in aggregate function to concatenate strings. The simplest custom implementation (suggested by Vajda Gabo in this mailing list post, among many others) is to use the built-in textcat function (which lies behind the || operator):

CREATE AGGREGATE textcat_all(
  basetype    = text,
  sfunc       = textcat,
  stype       = text,
  initcond    = ''
);

Here is the CREATE AGGREGATE documentation.

This simply glues all the strings together, with no separator. In order to get a ", " inserted in between them without having it at the end, you might want to make your own concatenation function and substitute it for the "textcat" above. Here is one I put together and tested on 8.3.12:

CREATE FUNCTION commacat(acc text, instr text) RETURNS text AS $$
  BEGIN
    IF acc IS NULL OR acc = '' THEN
      RETURN instr;
    ELSE
      RETURN acc || ', ' || instr;
    END IF;
  END;
$$ LANGUAGE plpgsql;

This version will output a comma even if the value in the row is null or empty, so you get output like this:

a, b, c, , e, , g

If you would prefer to remove extra commas to output this:

a, b, c, e, g

Then add an ELSIF check to the function like this:

CREATE FUNCTION commacat_ignore_nulls(acc text, instr text) RETURNS text AS $$
  BEGIN
    IF acc IS NULL OR acc = '' THEN
      RETURN instr;
    ELSIF instr IS NULL OR instr = '' THEN
      RETURN acc;
    ELSE
      RETURN acc || ', ' || instr;
    END IF;
  END;
$$ LANGUAGE plpgsql;

OTHER TIPS

How about using Postgres built-in array functions? At least on 8.4 this works out of the box:

SELECT company_id, array_to_string(array_agg(employee), ',')
FROM mytable
GROUP BY company_id;

As from PostgreSQL 9.0 you can use the aggregate function called string_agg. Your new SQL should look something like this:

SELECT company_id, string_agg(employee, ', ')
FROM mytable
GROUP BY company_id;

I claim no credit for the answer because I found it after some searching:

What I didn't know is that PostgreSQL allows you to define your own aggregate functions with CREATE AGGREGATE

This post on the PostgreSQL list shows how trivial it is to create a function to do what's required:

CREATE AGGREGATE textcat_all(
  basetype    = text,
  sfunc       = textcat,
  stype       = text,
  initcond    = ''
);

SELECT company_id, textcat_all(employee || ', ')
FROM mytable
GROUP BY company_id;

As already mentioned, creating your own aggregate function is the right thing to do. Here is my concatenation aggregate function (you can find details in French):

CREATE OR REPLACE FUNCTION concat2(text, text) RETURNS text AS '
    SELECT CASE WHEN $1 IS NULL OR $1 = \'\' THEN $2
            WHEN $2 IS NULL OR $2 = \'\' THEN $1
            ELSE $1 || \' / \' || $2
            END; 
'
 LANGUAGE SQL;

CREATE AGGREGATE concatenate (
  sfunc = concat2,
  basetype = text,
  stype = text,
  initcond = ''

);

And then use it as:

SELECT company_id, concatenate(employee) AS employees FROM ...

This latest announcement list snippet might be of interest if you'll be upgrading to 8.4:

Until 8.4 comes out with a super-effient native one, you can add the array_accum() function in the PostgreSQL documentation for rolling up any column into an array, which can then be used by application code, or combined with array_to_string() to format it as a list:

http://www.postgresql.org/docs/current/static/xaggr.html

I'd link to the 8.4 development docs but they don't seem to list this feature yet.

Following up on Kev's answer, using the Postgres docs:

First, create an array of the elements, then use the built-in array_to_string function.

CREATE AGGREGATE array_accum (anyelement)
(
 sfunc = array_append,
 stype = anyarray,
 initcond = '{}'
);

select array_to_string(array_accum(name),'|') from table group by id;

Following yet again on the use of a custom aggregate function of string concatenation: you need to remember that the select statement will place rows in any order, so you will need to do a sub select in the from statement with an order by clause, and then an outer select with a group by clause to aggregate the strings, thus:

SELECT custom_aggregate(MY.special_strings)
FROM (SELECT special_strings, grouping_column 
        FROM a_table 
        ORDER BY ordering_column) MY
GROUP BY MY.grouping_column

I found this PostgreSQL documentation helpful: http://www.postgresql.org/docs/8.0/interactive/functions-conditional.html.

In my case, I sought plain SQL to concatenate a field with brackets around it, if the field is not empty.

select itemid, 
  CASE 
    itemdescription WHEN '' THEN itemname 
    ELSE itemname || ' (' || itemdescription || ')' 
  END 
from items;

Use STRING_AGG function for PostgreSQL and Google BigQuery SQL:

SELECT company_id, STRING_AGG(employee, ', ')
FROM employees
GROUP BY company_id;

According to version PostgreSQL 9.0 and above you can use the aggregate function called string_agg. Your new SQL should look something like this:

SELECT company_id, string_agg(employee, ', ')
    FROM mytable GROUP BY company_id;

You can also use format function. Which can also implicitly take care of type conversion of text, int, etc by itself.

create or replace function concat_return_row_count(tbl_name text, column_name text, value int)
returns integer as $row_count$
declare
total integer;
begin
    EXECUTE format('select count(*) from %s WHERE %s = %s', tbl_name, column_name, value) INTO total;
    return total;
end;
$row_count$ language plpgsql;


postgres=# select concat_return_row_count('tbl_name','column_name',2); --2 is the value

I'm using Jetbrains Rider and it was a hassle copying the results from above examples to re-execute because it seemed to wrap it all in JSON. This joins them into a single statement that was easier to run

select string_agg('drop table if exists "' || tablename || '" cascade', ';') 
from pg_tables where schemaname != $$pg_catalog$$ and tableName like $$rm_%$$

If you are on Amazon Redshift, where string_agg is not supported, try using listagg.

SELECT company_id, listagg(EMPLOYEE, ', ') as employees
FROM EMPLOYEE_table
GROUP BY company_id;

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow

How to concatenate strings of a string field in a PostgreSQL 'group by' query?

PostgreSQL 9.0 or later:

PostgreSQL 8.4 or later:

string_agg for pre-9.0 versions:

Custom variations (all Postgres versions)

`string_agg` for pre-9.0 versions: