Question

Take a look at the MySQL table below called "Articles":

+----+-----------+---------+------------------------+--------------------------+
| id | articleId | version | title                  | content                  |
+----+-----------+---------+------------------------+--------------------------+
|  1 |         1 | 0.0     | ArticleNo.1 title v0.0 | ArticleNo.1 content v0.0 |
|  2 |         1 | 1.0     | ArticleNo.1 title v1.0 | ArticleNo.1 content v1.0 |
|  3 |         1 | 1.5     | ArticleNo.1 title v1.5 | ArticleNo.1 content v1.5 |
|  4 |         1 | 2.0     | ArticleNo.1 title v2.0 | ArticleNo.1 content v2.0 |
|  5 |         2 | 1.0     | ArticleNo.2 title v1.0 | ArticleNo.2 content v1.0 |
|  6 |         2 | 2.0     | ArticleNo.2 title v2.0 | ArticleNo.2 content v2.0 |
+----+-----------+---------+------------------------+--------------------------+

I'm trying to come up with a query that returns Articles.id for the row with the highest Articles.version of each articleId.

The actual Articles table contains over 10,000 entries.

So in this example I ONLY want Articles.id 4 and 6 to be returned. I've been looking at the DISTINCT keyword and the MAX() function but can't seem to nail it.

Any suggestions appreciated...

Solution

You need a correlated subquery here:

SELECT a.id, a.version
FROM articles a
WHERE a.version = (
    SELECT MAX(version)
    FROM articles b
    WHERE b.articleId = a.articleId
)
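
To try the query above without a MySQL server, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's built-in sqlite3 module. SQLite standing in for MySQL, the REAL type for version, and the added ORDER BY are assumptions made for the demo, not part of the original answer.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows mirror the sample table in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Articles (id INTEGER PRIMARY KEY, articleId INTEGER, version REAL)"
)
conn.executemany(
    "INSERT INTO Articles (id, articleId, version) VALUES (?, ?, ?)",
    [(1, 1, 0.0), (2, 1, 1.0), (3, 1, 1.5), (4, 1, 2.0), (5, 2, 1.0), (6, 2, 2.0)],
)

# For each outer row, the inner query finds the highest version of the same article.
latest_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT a.id
        FROM Articles a
        WHERE a.version = (
            SELECT MAX(b.version)
            FROM Articles b
            WHERE b.articleId = a.articleId
        )
        ORDER BY a.id
        """
    )
]
print(latest_ids)  # [4, 6]
```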

OTHER TIPS

From the MySQL manual, this does the trick:

SELECT a1.id
FROM Articles a1
LEFT JOIN Articles a2 ON a1.articleId = a2.articleId AND a1.version < a2.version
WHERE a2.articleId IS NULL;

From the linked manual:

The LEFT JOIN works on the basis that when a1.version is at its maximum value, there is no a2.version with a greater value and thus the corresponding a2.articleId value is NULL.

(I edited to make their text match your situation.)
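
It may help to look at the joined rows before the IS NULL filter is applied. The sketch below runs the same LEFT JOIN on the question's sample data, using an in-memory SQLite database via Python's sqlite3 module as a stand-in for MySQL (an assumption for the demo), and does the NULL filtering in Python so the intermediate pairs stay visible.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows mirror the sample table in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Articles (id INTEGER PRIMARY KEY, articleId INTEGER, version REAL)"
)
conn.executemany(
    "INSERT INTO Articles (id, articleId, version) VALUES (?, ?, ?)",
    [(1, 1, 0.0), (2, 1, 1.0), (3, 1, 1.5), (4, 1, 2.0), (5, 2, 1.0), (6, 2, 2.0)],
)

# Pair every row (a1) with each row of the same article that has a higher version (a2).
pairs = conn.execute(
    """
    SELECT a1.id, a2.id
    FROM Articles a1
    LEFT JOIN Articles a2
      ON a1.articleId = a2.articleId AND a1.version < a2.version
    ORDER BY a1.id, a2.id
    """
).fetchall()
for a1_id, a2_id in pairs:
    print(a1_id, a2_id)

# Rows with no higher version get NULL (None) on the a2 side; those are the latest ones.
latest_ids = [a1_id for a1_id, a2_id in pairs if a2_id is None]
print(latest_ids)  # [4, 6]
```

Every a2 column is NULL for an unmatched row, so filtering on a2.id here is equivalent to the WHERE a2.articleId IS NULL in the query above.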

You can use a subquery here:

select
    a.id
from
    articles a
where
    a.version = (select max(version) from articles)

Since this isn't a correlated subquery, it will be pretty fast (as fast or faster than a join), so you don't need to worry about that. But it'll only return the rows whose version equals the single highest version in the whole table, not the highest version of each articleid.
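
To see what the uncorrelated query returns, the sketch below adds one hypothetical row (id 7, a third article whose only version is 1.0) to the question's sample data. It uses an in-memory SQLite database via Python's sqlite3 module as a stand-in for MySQL; both the extra row and the engine are assumptions for the demo.

```python
import sqlite3

# SQLite stands in for MySQL here; rows 1-6 mirror the sample table in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Articles (id INTEGER PRIMARY KEY, articleId INTEGER, version REAL)"
)
conn.executemany(
    "INSERT INTO Articles (id, articleId, version) VALUES (?, ?, ?)",
    [
        (1, 1, 0.0), (2, 1, 1.0), (3, 1, 1.5), (4, 1, 2.0), (5, 2, 1.0), (6, 2, 2.0),
        (7, 3, 1.0),  # hypothetical extra article whose latest version is only 1.0
    ],
)

# The inner query runs once and yields the table-wide maximum (2.0).
global_max_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT a.id
        FROM Articles a
        WHERE a.version = (SELECT MAX(version) FROM Articles)
        ORDER BY a.id
        """
    )
]
print(global_max_ids)  # [4, 6]
```

Row 7 is the latest version of article 3 but is left out, because 1.0 is below the table-wide maximum. On the original six rows the query only looks right because both articles happen to top out at 2.0.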

Of course, I'd recommend that you throw an index on articleid (assuming id is already clustered). That will help it return even faster.
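
As a rough sketch of the indexing advice, the snippet below creates an index and asks the engine for its query plan. Two things here go beyond the answer and are assumptions: the index is a composite one on (articleId, version) rather than on articleid alone, and the index name is made up. It runs on an in-memory SQLite database via Python's sqlite3 module; the CREATE INDEX statement has the same form in MySQL, where EXPLAIN plays the role of SQLite's EXPLAIN QUERY PLAN.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows mirror the sample table in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Articles (id INTEGER PRIMARY KEY, articleId INTEGER, version REAL)"
)
conn.executemany(
    "INSERT INTO Articles (id, articleId, version) VALUES (?, ?, ?)",
    [(1, 1, 0.0), (2, 1, 1.0), (3, 1, 1.5), (4, 1, 2.0), (5, 2, 1.0), (6, 2, 2.0)],
)

# Composite index (hypothetical name): articleId locates the group,
# and the versions inside each group are stored in sorted order.
conn.execute(
    "CREATE INDEX idx_articles_articleid_version ON Articles (articleId, version)"
)

query = """
    SELECT a.id
    FROM Articles a
    WHERE a.version = (
        SELECT MAX(b.version) FROM Articles b WHERE b.articleId = a.articleId
    )
    ORDER BY a.id
"""
latest_ids = [row[0] for row in conn.execute(query)]
print(latest_ids)  # [4, 6]

# Print the engine's plan; its wording depends on the engine and its version.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for step in plan:
    print(step[-1])
```

Whether the optimizer actually picks the index depends on the engine and on the data, so it is worth checking the plan on the real table rather than trusting this toy one.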

As noted in the comments, if you want to get the max version for each articleid, you can do the following:

select
    a.id
from
    articles a
    inner join (select articleid, max(version) as version
                from articles group by articleid) as b on
        a.articleid = b.articleid
        and a.version = b.version

This will create a join out of that subquery and typically be much faster than a correlated subquery that selects the max for each articleid.
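
Here is a small sketch of the derived-table join on the question's sample data, run on an in-memory SQLite database via Python's sqlite3 module as a stand-in for MySQL (an assumption for the demo). The aggregate is given an alias, max_version (a name chosen here), because the outer join condition can only refer to the aggregate column by a name.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows mirror the sample table in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Articles (id INTEGER PRIMARY KEY, articleId INTEGER, version REAL)"
)
conn.executemany(
    "INSERT INTO Articles (id, articleId, version) VALUES (?, ?, ?)",
    [(1, 1, 0.0), (2, 1, 1.0), (3, 1, 1.5), (4, 1, 2.0), (5, 2, 1.0), (6, 2, 2.0)],
)

# Step 1: the derived table on its own, one row per article with its highest version.
per_article_max = conn.execute(
    """
    SELECT articleId, MAX(version) AS max_version
    FROM Articles
    GROUP BY articleId
    ORDER BY articleId
    """
).fetchall()
print(per_article_max)  # [(1, 2.0), (2, 2.0)]

# Step 2: join back to the full table to recover the id of each winning row.
latest_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT a.id
        FROM Articles a
        INNER JOIN (
            SELECT articleId, MAX(version) AS max_version
            FROM Articles
            GROUP BY articleId
        ) AS b
          ON a.articleId = b.articleId AND a.version = b.max_version
        ORDER BY a.id
        """
    )
]
print(latest_ids)  # [4, 6]
```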

SELECT Articles.id FROM Articles WHERE Articles.version IN (SELECT MAX(Articles.version) FROM Articles)

maybe something like:

select * from articles where (articleid, version) in (select articleid, max(version) from articles group by articleid)

You can do

SELECT a.id
  FROM articles a
 WHERE NOT EXISTS(SELECT * FROM articles a2 WHERE a2.articleId = a.articleId AND a2.version > a.version)

I don't know how MySQL will perform with it, but Oracle and SQL Server are fine with it.

All the above answers may solve DJDonaL3000's stated problem, but most of them rely on correlated subqueries, which hinder the performance of the application. If you are running a cloud application, I suggest not using the above queries. The best way then is to maintain two tables, A and B: A always contains the latest version of each article, and B holds the details of all versions. The trade-off is that there will be more updates and insertions into the tables, but the fetch will be faster because you only need to execute a normal select query on table A. Doing group by and max in an inner query will cost you
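
One way the two-table idea could look is sketched below. The table names LatestArticles (table A) and ArticleVersions (table B), the add_version helper, and the choice to keep both tables in sync from application code are all assumptions made for illustration; the answer does not say how the tables are maintained. The sketch uses an in-memory SQLite database via Python's sqlite3 module as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table B in the answer: every version ever written (hypothetical name).
conn.execute(
    "CREATE TABLE ArticleVersions (id INTEGER PRIMARY KEY, articleId INTEGER, "
    "version REAL, title TEXT, content TEXT)"
)
# Table A in the answer: exactly one row per article, holding its latest version.
conn.execute(
    "CREATE TABLE LatestArticles (articleId INTEGER PRIMARY KEY, id INTEGER, "
    "version REAL, title TEXT, content TEXT)"
)


def add_version(conn, article_id, version, title, content):
    """Insert into the history table, then refresh the latest table if needed."""
    with conn:  # both writes commit together or roll back together
        new_id = conn.execute(
            "INSERT INTO ArticleVersions (articleId, version, title, content) "
            "VALUES (?, ?, ?, ?)",
            (article_id, version, title, content),
        ).lastrowid
        current = conn.execute(
            "SELECT version FROM LatestArticles WHERE articleId = ?", (article_id,)
        ).fetchone()
        if current is None:
            # First version seen for this article.
            conn.execute(
                "INSERT INTO LatestArticles VALUES (?, ?, ?, ?, ?)",
                (article_id, new_id, version, title, content),
            )
        elif version > current[0]:
            # Newer than what is stored, so it becomes the latest.
            conn.execute(
                "UPDATE LatestArticles SET id = ?, version = ?, title = ?, content = ? "
                "WHERE articleId = ?",
                (new_id, version, title, content, article_id),
            )
    return new_id


for article_id, version in [(1, 0.0), (1, 1.0), (1, 1.5), (1, 2.0), (2, 1.0), (2, 2.0)]:
    add_version(conn, article_id, version, f"title v{version}", f"content v{version}")

# Reading the latest versions is now a plain select with no MAX or GROUP BY.
latest_ids = [
    row[0] for row in conn.execute("SELECT id FROM LatestArticles ORDER BY articleId")
]
print(latest_ids)  # [4, 6]
```

The cost shows up on the write path: every new version touches two tables, and the two must be kept consistent, which is why both writes sit in one transaction here.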

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow