Question

Good day, I have a question I'm struggling with a lot, hope somebody already found a clever solution to this (I use MySQL).

I have table like this:

Table `log`
----------
id
inserted
message
user_id

My goal is to select last inserted record for user and make this fast. Log table is huge (around 900k records), so my first approach was:

SELECT * FROM `log` 
LEFT JOIN `users` ON `users`.`id` = `log`.`user_id`
WHERE `id` IN 
(
 SELECT MAX(`id`) FROM `log` GROUP BY `user_id`
)

But it seems it calculate subquery for every row (EXPLAIN shows DEPENDENT QUERY). When I split this query for two:

SELECT MAX(`id`) FROM `log` GROUP BY `user_id`

and

SELECT * FROM `log` 
LEFT JOIN `users` ON `users`.`id` = `log`.`user_id`
WHERE `id` IN (....ids from first query...)

It is acceptable to run. Can this be achived by one query?

Was it helpful?

Solution

In addition to using group by to fetch the group-wise maximum you probably want to make it an uncorrelated subquery to fetch additional fields for the specific rows from the table.

SELECT
  la.user_id,la.message
FROM
  `log` as la
INNER JOIN
  (
    SELECT
      user_id, MAX(id) AS maxid
    FROM
      `log`
    GROUP BY
      user_id
    ) as lb
ON
  la.id = lb.maxid

This works best/fastest if you have an index

KEY `foo` (`user_id`,`id`)

but even without that key the performance is descent.

OTHER TIPS

How about

SELECT user_id, max(id) FROM `log` GROUP BY user_id

?

This will get you the maximum id for every user in the log table, all in one query!

If you always are looking for the log for a particular user, partitioning the log file by user_id would speed things up a lot. If the table is partitioned by user and indexed by id, the query will run very fast.

EDIT: see Dominik's query

In addition, I would make sure you have an index on user_id.

EDIT: generalized

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top