Question

Suppose we have two tables, Post and Comment. A Post has many Comments. Assume both are populated so that the number of comments per post varies. I want a query that returns all posts, but only the newest comment for each post.

I have been pointed toward joins and subqueries, but I can't figure it out.

Example Output:

Post1: Comment4 (newest for post1)

Post2: Comment2 (newest for post2)

Post3: Comment10 (newest for post3)

etc...

Any help would be greatly appreciated. Thanks.


Solution

This answer assumes that each comment has a unique identifier, and that the identifier is an increasing number. That is, later comments have higher numbers than earlier comments. The numbers don't have to be sequential; they just have to correspond to the order of insertion.

First, do a query that extracts the maximum comment id, grouped by post id.

Something like this:

SELECT MAX(ID) MaxCommentID, PostID
FROM Comments
GROUP BY PostID

This will give you a list of post IDs, and the highest (latest) comment ID for each one.

Then you join against this result to extract the rest of the data from the comments for those IDs.

SELECT C1.*, C2.PostID
FROM Comments AS C1
     INNER JOIN (
         SELECT MAX(ID) MaxCommentID, PostID
         FROM Comments
         GROUP BY PostID
     ) AS C2 ON C1.ID = C2.MaxCommentID

Then, you join with the posts, to get the information about those posts.

SELECT C1.*, P.*
FROM Comments AS C1
     INNER JOIN (
         SELECT MAX(ID) MaxCommentID, PostID
         FROM Comments
         GROUP BY PostID
     ) AS C2 ON C1.ID = C2.MaxCommentID
     INNER JOIN Posts AS P ON C2.PostID = P.ID
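
To try the derived-table join end to end, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The sample rows and the Title and Body columns are made up for illustration; only Posts.ID, Comments.ID and Comments.PostID come from the answer.

```python
import sqlite3

# Hypothetical sample data; Title and Body are illustrative columns only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Posts (ID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Comments (ID INTEGER PRIMARY KEY, PostID INTEGER, Body TEXT);
    INSERT INTO Posts VALUES (1, 'Post1'), (2, 'Post2'), (3, 'Post3');
    INSERT INTO Comments VALUES
        (1, 1, 'Comment1'), (2, 2, 'Comment2'), (3, 1, 'Comment3'),
        (4, 1, 'Comment4'), (7, 3, 'Comment7'), (10, 3, 'Comment10');
""")

# Join each comment to the per-post maximum comment ID, then to its post.
latest_by_join = conn.execute("""
    SELECT P.Title, C1.Body
    FROM Comments AS C1
         INNER JOIN (
             SELECT MAX(ID) AS MaxCommentID, PostID
             FROM Comments
             GROUP BY PostID
         ) AS C2 ON C1.ID = C2.MaxCommentID
         INNER JOIN Posts AS P ON C2.PostID = P.ID
    ORDER BY P.ID
""").fetchall()

for title, body in latest_by_join:
    print(f"{title}: {body}")
```

With this sample data the loop prints one line per post, pairing Post1 with Comment4, Post2 with Comment2 and Post3 with Comment10.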

An alternative approach doesn't use the PostID of the inner query at all. First, pick out the maximum comment ID for each post, without caring which post it belongs to; comment IDs are unique, so the ID alone identifies the row.

SELECT MAX(ID) AS MaxCommentID
FROM Comments
GROUP BY PostID

Then do an IN clause, to get the rest of the data for those comments:

SELECT C1.*
FROM Comments AS C1
WHERE C1.ID IN (
    SELECT MAX(ID) AS MaxCommentID
    FROM Comments
    GROUP BY PostID
)

Then simply join in the posts:

SELECT C1.*, P.*
FROM Comments AS C1
     INNER JOIN Posts AS P ON C1.PostID = P.ID
WHERE C1.ID IN (
    SELECT MAX(ID) AS MaxCommentID
    FROM Comments
    GROUP BY PostID
)
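
The IN variant can be checked the same way. This is a sketch under the same assumptions (in-memory SQLite via Python's sqlite3, made-up rows, hypothetical Title and Body columns), with an extra Post4 that has no comments to show how the inner join treats it.

```python
import sqlite3

# Hypothetical sample data; Post4 deliberately has no comments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Posts (ID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Comments (ID INTEGER PRIMARY KEY, PostID INTEGER, Body TEXT);
    INSERT INTO Posts VALUES
        (1, 'Post1'), (2, 'Post2'), (3, 'Post3'), (4, 'Post4');
    INSERT INTO Comments VALUES
        (1, 1, 'Comment1'), (2, 2, 'Comment2'), (3, 1, 'Comment3'),
        (4, 1, 'Comment4'), (7, 3, 'Comment7'), (10, 3, 'Comment10');
""")

# Keep only comments whose ID is the maximum for some post, then add the post.
latest_by_in = conn.execute("""
    SELECT P.Title, C1.Body
    FROM Comments AS C1
         INNER JOIN Posts AS P ON C1.PostID = P.ID
    WHERE C1.ID IN (
        SELECT MAX(ID) AS MaxCommentID
        FROM Comments
        GROUP BY PostID
    )
    ORDER BY P.ID
""").fetchall()

for title, body in latest_by_in:
    print(f"{title}: {body}")
```

Note that Post4 does not appear in the result: because the query starts from Comments and uses an INNER JOIN, posts without any comments are dropped. If every post must be listed, a LEFT JOIN starting from Posts is one way to keep them.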

OTHER TIPS

Select the newest comment from a subquery

e.g.

Select * 
from Posts po
Inner Join
(
Select a.CommentThread, a.CommentDate, a.CommentBody, a.Post from comments a
inner join 
(select commentthread, max(commentdate) as commentdate
from comments
group by commentthread) b
on a.commentthread = b.commentthread
and a.commentdate = b.commentdate
) co
on po.Post = co.post

select *
  from post
     , comments
 where post.post_id = comments.post_id
   and comments.comments_id = (select max(z.comments_id) from comments z where z.post_id = post.post_id)

And if you are still stuck with an old MySQL version that doesn't support subqueries, you can use something like:

SELECT
  p.id, c1.id
FROM
  posts as p
LEFT JOIN
  comments as c1
ON
  p.id = c1.postId
LEFT JOIN
  comments as c2
ON
  c1.postId = c2.postId
  AND c1.id < c2.id
WHERE
  isnull(c2.id)
ORDER BY
  p.id

Either way, check your query with EXPLAIN for performance issues.
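
As a rough illustration of the self-join variant, the sketch below runs it on an in-memory SQLite database through Python's sqlite3 module. The sample rows and the title and body columns are hypothetical. Two things differ from the MySQL original and are assumptions of this sketch: `c2.id IS NULL` replaces `isnull(c2.id)`, since SQLite has no `isnull()` function, and SQLite's `EXPLAIN QUERY PLAN` stands in for MySQL's `EXPLAIN`.

```python
import sqlite3

# Hypothetical sample data; post 4 deliberately has no comments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, postId INTEGER, body TEXT);
    INSERT INTO posts VALUES
        (1, 'Post1'), (2, 'Post2'), (3, 'Post3'), (4, 'Post4');
    INSERT INTO comments VALUES
        (1, 1, 'Comment1'), (2, 2, 'Comment2'), (3, 1, 'Comment3'),
        (4, 1, 'Comment4'), (7, 3, 'Comment7'), (10, 3, 'Comment10');
""")

# c2 matches any newer comment on the same post; keeping only rows where
# no such c2 exists leaves the newest comment (or NULL for an empty post).
query = """
    SELECT p.id, c1.id
    FROM posts AS p
    LEFT JOIN comments AS c1 ON p.id = c1.postId
    LEFT JOIN comments AS c2 ON c1.postId = c2.postId AND c1.id < c2.id
    WHERE c2.id IS NULL
    ORDER BY p.id
"""
latest_by_self_join = conn.execute(query).fetchall()
print(latest_by_self_join)

# SQLite's counterpart to EXPLAIN; the exact text varies by SQLite version.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for row in plan:
    print(row[-1])
```

Because this version starts from posts and uses LEFT JOINs, post 4 is kept with a NULL comment ID (None in Python) instead of being dropped. The plan output is only indicative here; on MySQL you would run EXPLAIN against your real tables and indexes.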

Licensed under: CC-BY-SA with attribution