Question

I have two tables, room and message, in a chat database:

CREATE TABLE room (
    id serial primary key,
    name varchar(50) UNIQUE NOT NULL,
    private boolean NOT NULL default false,
    description text NOT NULL
);

CREATE TABLE message (
    id bigserial primary key,
    room integer references room(id),
    author integer references player(id),
    created integer NOT NULL
);

Let's say I want to get the rooms a given user has posted in, with the number of that user's messages in each room and the date of the most recent one:

 id | number | last_created | description |      name        | private 
----+--------+--------------+-------------+------------------+---------
  2 |   1149 |   1391703964 |             | Dragons & co     | t
  8 |    136 |   1391699600 |             | Javascript       | f
 10 |     71 |   1391684998 |             | WBT              | t
  1 |     86 |   1391682712 |             | Miaou            | f
  3 |    423 |   1391681764 |             | Code & Baguettes | f
  ...

I see two solutions:

1) Selecting/grouping on the messages and using subqueries to get the room columns:

select m.room as id, count(*) number, max(created) last_created,
(select name from room where room.id=m.room),
(select description from room where room.id=m.room),
(select private from room where room.id=m.room)
from message m where author=$1 group by room order by last_created desc limit 10

This makes three almost identical subqueries, which looks very dirty. I could reverse it to do only two subqueries on message columns, but it wouldn't be much better.

2) Selecting on both tables and using aggregate functions for all columns:

select room.id, count(*) number, max(created) last_created,
max(name) as name, max(description) as description, bool_or(private) as private
from message, room
where message.room=room.id and author=$1
group by room.id order by last_created desc limit 10

All those aggregate functions look messy and useless.

Is there a clean solution here?

This looks like a general problem to me. Theoretically, those aggregate functions are useless because, by construction, all the joined rows in a group come from the same room row. I'd like to know if there's a general solution.

Solution

Try performing the grouping in a subquery:

select m.id, m.number, m.last_created, r.name, r.description, r.private
from (
    select m.room as id, count(*) number, max(created) last_created
    from message m 
    where author=$1 
    group by room 
) m
 join room r
   on r.id = m.id
order by m.last_created desc limit 10
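
To try this query from psql rather than through a driver, one option is to bind the $1 placeholder with PREPARE and EXECUTE. This is only a minimal sketch, assuming the question's room and message tables already exist; the statement name rooms_for_author and the author id 42 are made up for illustration.

```sql
-- Hypothetical statement name; $1 is declared as integer to match message.author.
PREPARE rooms_for_author(integer) AS
select m.id, m.number, m.last_created, r.name, r.description, r.private
from (
    -- Aggregate first, so the join only sees one row per room.
    select m.room as id, count(*) as number, max(created) as last_created
    from message m
    where author = $1
    group by room
) m
join room r on r.id = m.id
order by m.last_created desc
limit 10;

-- 42 is a placeholder author id, not a value from the question.
EXECUTE rooms_for_author(42);

-- Prepared statements live for the session; drop it when done.
DEALLOCATE rooms_for_author;
```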

Edit: Another option (likely with similar performance) is to move that aggregation into a view, something like:

create view MessagesByRoom
as 
select m.author, m.room, count(*) number, max(created) last_created
from message m 
group by author, room

And then use it like:

select m.room, m.number, m.last_created, r.name, r.description, r.private
from MessagesByRoom m
 join room r
   on r.id = m.room
where m.author = $1
order by m.last_created desc limit 10

OTHER TIPS

Maybe use a join?

SELECT 
  r.id, count(*) number_of_posts, 
  max(m.created) last_created,
  r.name, r.description, r.private
FROM room r
JOIN message m on r.id = m.room
WHERE m.author = $1
GROUP BY r.id 
ORDER BY last_created desc

You can include the columns in the group by:

select room.id, count(*) number, max(message.created) last_created,
       room.name, room.description, room.private
from message join
     room
     on message.room=room.id and author=$1
group by room.id, name, description, private
order by last_created desc
limit 10;

EDIT:

This query will work in more recent versions of Postgres (9.1 and later), because room.id is the primary key of room:

select room.id, count(*) number, max(message.created) last_created,
       room.name, room.description, room.private
from message join
     room
     on message.room=room.id and author=$1
group by room.id
order by last_created desc
limit 10;

Earlier versions of the documentation are pretty clear that you would need to include all the columns:

When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.

The ANSI standard actually does allow the above query with just group by room.id, because the other room columns are functionally dependent on the primary key. Support for this is a rather recent addition in the databases that implement it.
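
The primary-key rule can be checked on a scratch database. The sketch below assumes PostgreSQL 9.1 or later and uses made-up temporary tables (demo_room, demo_message) with invented rows, trimmed from the question's schema so it does not touch the real tables.

```sql
-- Hypothetical scratch tables; names and rows are for illustration only.
CREATE TEMP TABLE demo_room (
    id serial PRIMARY KEY,
    name varchar(50) UNIQUE NOT NULL,
    description text NOT NULL
);

CREATE TEMP TABLE demo_message (
    id bigserial PRIMARY KEY,
    room integer REFERENCES demo_room(id),
    created integer NOT NULL
);

INSERT INTO demo_room (name, description) VALUES ('Miaou', ''), ('WBT', '');
INSERT INTO demo_message (room, created)
VALUES (1, 1391682712), (1, 1391682000), (2, 1391684998);

-- Accepted: r.id is the primary key of demo_room, so r.name and
-- r.description are functionally dependent on the grouped column.
select r.id, count(*) as number, max(m.created) as last_created,
       r.name, r.description
from demo_room r
join demo_message m on m.room = r.id
group by r.id
order by last_created desc;

-- Rejected if uncommented: m.room is not the primary key of demo_room,
-- so PostgreSQL reports that r.name must appear in the GROUP BY clause
-- or be used in an aggregate function, even though the join makes
-- m.room equal to r.id.
-- select m.room, r.name, count(*) as number
-- from demo_message m
-- join demo_room r on r.id = m.room
-- group by m.room;
```

This is also why the first query in the question, which groups by message.room, cannot simply select the room columns directly: the grouped column has to be the primary key of the table whose columns are selected.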

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow