Question

I have the following table:

    id       time      text      otheridentifier
    -------------------------------------------
    1        6         apple     4
    2        7         orange    4
    3        8         banana    3
    4        9         pear      3
    5        10        grape     2

What I want to do is select the 3 most recent records (by time desc), whose otheridentifiers are distinct. So in this case, the result would be id's: 5, 4, and 2.

id = 3 would be skipped because there's a more recent record with the same otheridentifier field.

Here's what I tried to do:

SELECT * FROM `table` GROUP BY (`otheridentifier`) ORDER BY `time` DESC LIMIT 3

However, I end up getting rows of id = 5, 3, and 1 instead of 5, 4, 2 as expected.

Can someone tell me why this query wouldn't return what I expected? I tried changing the ORDER BY to ASC but this simply rearranges the returned rows to 1, 3, 5.

Was it helpful?

Solution

It doesn't return what you expect because grouping happens before ordering, as reflected by the position of the clauses in the SQL statement. You're unfortunately going to have to get fancier to get the rows you want. Try this:

SELECT *
FROM `table`
WHERE `id` = (
    SELECT `id`
    FROM `table` as `alt`
    WHERE `alt`.`otheridentifier` = `table`.`otheridentifier`
    ORDER BY `time` DESC
    LIMIT 1
)
ORDER BY `time` DESC
LIMIT 3

OTHER TIPS

You could join the table on itself to filter the last entry per otheridentifier, and then take the top 3 rows of that:

SELECT last.*
FROM `table` last
LEFT JOIN `table` prev 
    ON prev.`otheridentifier` = last.`otheridentifier`
    AND prev.`time` < last.`time`
WHERE prev.`id` is null
ORDER BY last.`time` DESC 
LIMIT 3

I had a similar requirement, but I had more advanced selection criteria. Using some of the other answers I couldn't get exactly what I needed, but I found you can still do a GROUP BY after and ORDER BY like this:

SELECT t.* FROM (SELECT * FROM table ORDER BY time DESC) t 
GROUP BY t.otheridentifier
SELECT * FROM table t1 
WHERE t1.time = 
    (SELECT MAX(time) FROM table t2 
     WHERE t2.otheridentifier = t1.otheridentifier)

Andomar's answer is probably best in that it doesn't use a subquery.

An alternative approach:

select *
from   `table` t1
where  t1.`time` in (
                    select   max(s2.`time`)
                    from     `table` t2
                    group by t2.otheridentifier
                    )

You can use this query to get correct answer:

SELECT * FROM 
      (SELECT * FROM `table` order by time DESC)
          t group by otheridentifier

what about

SELECT *, max(time) FROM `table`  group by otheridentifier

This also:

SELECT * FROM
OrigTable T INNER JOIN 
( 
SELECT otheridentifier,max(time) AS duration
FROM T
GROUP BY otheridentifier) S
ON S.duration = T.time AND S.otheridentifier = T.otheridentifier.
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top