SQL Query Question: X has many Y. Get all X and get only the newest Y per X
Question
Suppose we have two tables, Post and Comment, where a Post has many Comments. Assume they are populated so that the number of comments per post varies. I want a query that returns all posts, but only the newest comment for each post.
I have been pointed to joins and subqueries, but I can't figure it out.
Example Output:
Post1: Comment4 (newest for post1)
Post2: Comment2 (newest for post2)
Post3: Comment10 (newest for post3)
etc...
Any help would be greatly appreciated. Thanks.
Solution
This answer assumes that each comment has a unique identifier and that it is an increasing number. That is, later comments have higher IDs than earlier comments. The IDs don't have to be sequential; they just have to follow the order of insertion.
First, do a query that extracts the maximum comment id, grouped by post id.
Something like this:
SELECT MAX(ID) MaxCommentID, PostID
FROM Comments
GROUP BY PostID
This will give you a list of post IDs and the highest (latest) comment ID for each one.
Then join against this result to extract the rest of the comment data for those IDs.
SELECT C1.*, C2.PostID
FROM Comments AS C1
INNER JOIN (
    SELECT MAX(ID) AS MaxCommentID, PostID
    FROM Comments
    GROUP BY PostID
) AS C2 ON C1.ID = C2.MaxCommentID
Then, you join with the posts, to get the information about those posts.
SELECT C1.*, P.*
FROM Comments AS C1
INNER JOIN (
    SELECT MAX(ID) AS MaxCommentID, PostID
    FROM Comments
    GROUP BY PostID
) AS C2 ON C1.ID = C2.MaxCommentID
INNER JOIN Posts AS P ON C2.PostID = P.ID
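To try the join approach end to end, here is a minimal sketch using Python's built-in sqlite3 module. The table layout (Posts.ID, Posts.Title, Comments.ID, Comments.PostID, Comments.Body) and the sample rows are assumptions made up for illustration. Note that with INNER JOIN a post without comments (Post3 here) drops out of the result; if you really need all posts, one option is to start from Posts and LEFT JOIN, as in the second query.

```python
import sqlite3

# Hypothetical schema and sample data, not taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Posts (ID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Comments (ID INTEGER PRIMARY KEY, PostID INTEGER, Body TEXT);
    INSERT INTO Posts VALUES (1, 'Post1'), (2, 'Post2'), (3, 'Post3');
    INSERT INTO Comments VALUES
        (1, 1, 'Comment1'), (2, 2, 'Comment2'),
        (3, 1, 'Comment3'), (4, 1, 'Comment4');
""")

# Join approach from the answer: only posts that have comments come back.
inner_rows = conn.execute("""
    SELECT P.Title, C1.Body
    FROM Comments AS C1
    INNER JOIN (
        SELECT MAX(ID) AS MaxCommentID, PostID
        FROM Comments
        GROUP BY PostID
    ) AS C2 ON C1.ID = C2.MaxCommentID
    INNER JOIN Posts AS P ON C2.PostID = P.ID
    ORDER BY P.ID
""").fetchall()

# Variant that starts from Posts, so posts without comments are kept (Body is NULL).
left_rows = conn.execute("""
    SELECT P.Title, C1.Body
    FROM Posts AS P
    LEFT JOIN (
        SELECT MAX(ID) AS MaxCommentID, PostID
        FROM Comments
        GROUP BY PostID
    ) AS C2 ON C2.PostID = P.ID
    LEFT JOIN Comments AS C1 ON C1.ID = C2.MaxCommentID
    ORDER BY P.ID
""").fetchall()

print(inner_rows)
print(left_rows)
```

Which variant fits depends on whether posts without any comments should show up in the output.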
An alternate approach doesn't use the PostID of the inner query at all. First, pick out the maximum comment ID for each post, without returning which post it belongs to; since comment IDs are unique, the ID alone identifies the row.
SELECT MAX(ID) AS MaxCommentID
FROM Comments
GROUP BY PostID
Then use an IN clause to get the rest of the data for those comments:
SELECT C1.*
FROM Comments AS C1
WHERE C1.ID IN (
    SELECT MAX(ID) AS MaxCommentID
    FROM Comments
    GROUP BY PostID
)
Then simply join in the posts:
SELECT C1.*, P.*
FROM Comments AS C1
INNER JOIN Posts AS P ON C1.PostID = P.ID
WHERE C1.ID IN (
SELECT MAX(ID) AS MaxCommentID
FROM Comments
GROUP BY PostID
)
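The IN variant can be checked the same way. This is a small sketch with Python's sqlite3 module; the schema and rows are hypothetical and only there to make the query runnable.

```python
import sqlite3

# Hypothetical schema and sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Posts (ID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Comments (ID INTEGER PRIMARY KEY, PostID INTEGER, Body TEXT);
    INSERT INTO Posts VALUES (1, 'Post1'), (2, 'Post2');
    INSERT INTO Comments VALUES
        (1, 1, 'Comment1'), (2, 2, 'Comment2'),
        (3, 1, 'Comment3'), (4, 1, 'Comment4');
""")

# The subquery yields one comment ID per post (the highest one);
# the outer query keeps only the comments whose ID is in that set.
in_rows = conn.execute("""
    SELECT P.Title, C1.Body
    FROM Comments AS C1
    INNER JOIN Posts AS P ON C1.PostID = P.ID
    WHERE C1.ID IN (
        SELECT MAX(ID)
        FROM Comments
        GROUP BY PostID
    )
    ORDER BY P.ID
""").fetchall()

print(in_rows)
```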
OTHER TIPS
Select the newest comment in a subquery and join it to the posts, e.g.:
SELECT *
FROM Posts po
INNER JOIN (
    SELECT a.CommentThread, a.CommentDate, a.CommentBody, a.Post
    FROM Comments a
    INNER JOIN (
        SELECT CommentThread, MAX(CommentDate) AS CommentDate
        FROM Comments
        GROUP BY CommentThread
    ) b
        ON a.CommentThread = b.CommentThread
        AND a.CommentDate = b.CommentDate
) co
    ON po.Post = co.Post
Alternatively, use a correlated subquery that picks the highest comment ID for each post:
select *
from post
   , comments
where post.post_id = comments.post_id
  and comments.comments_id = (select max(z.comments_id)
                              from comments z
                              where z.post_id = post.post_id)
And if you are still stuck with an old MySQL version that doesn't support subqueries, you can use something like
SELECT p.id, c1.id
FROM posts AS p
LEFT JOIN comments AS c1 ON p.id = c1.postId
LEFT JOIN comments AS c2 ON c1.postId = c2.postId AND c1.id < c2.id
WHERE isnull(c2.id)
ORDER BY p.id
Either way, check your query with EXPLAIN for performance issues.
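The self-join trick works because a comment is the newest for its post exactly when no other comment on the same post has a higher ID, so the second LEFT JOIN finds nothing and c2.id is NULL. Below is a minimal sketch in Python with sqlite3; the schema and data are assumptions, `c2.id IS NULL` is used as the portable spelling of the NULL check, and SQLite's EXPLAIN QUERY PLAN stands in for MySQL's EXPLAIN (its output format differs between engines and versions).

```python
import sqlite3

# Hypothetical schema and sample data; post 3 has no comments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, postId INTEGER);
    INSERT INTO posts VALUES (1), (2), (3);
    INSERT INTO comments VALUES (1, 1), (2, 2), (3, 1), (4, 1);
""")

query = """
    SELECT p.id, c1.id
    FROM posts AS p
    LEFT JOIN comments AS c1 ON p.id = c1.postId
    LEFT JOIN comments AS c2 ON c1.postId = c2.postId AND c1.id < c2.id
    WHERE c2.id IS NULL
    ORDER BY p.id
"""

# Each post appears once, paired with its highest comment ID
# (or NULL/None when the post has no comments at all).
anti_join_rows = conn.execute(query).fetchall()
print(anti_join_rows)

# Inspect how SQLite plans the query; other engines have their own EXPLAIN.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for step in plan:
    print(step)
```

Because both joins are LEFT JOINs, this form also returns posts that have no comments.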