Get rid of slow subquery

https://stackoverflow.com/questions/23490025

16-07-2023
|

Pergunta

I have a database of connections to my server, currently with 2,300,000 rows.

I run the following query to find connections by a user, removing duplicates if the nick/ip/client_id are the same.

SELECT
    `nick`,
    INET_NTOA(`ip`) as `ip`,
    HEX(`client_id`) as `client_id`,
    UNIX_TIMESTAMP(`date`) as `date`
FROM
    (SELECT * FROM `joins` ORDER BY `date` DESC) as `sub`
WHERE
    `nick` LIKE '%nick%'
    -- Can also be things like this:
    --   `ip` & INET_ATON('255.255.0.0') = INET_ATON('123.123.0.0')
GROUP BY
    `nick`,
    `ip`,
    `client_id`
ORDER BY
    `date` DESC
LIMIT 500

Why do I use a subquery in the first place? To get the latest date value when using GROUP BY.

Solução

I think you've misunderstood the role of ORDER BY and GROUP BY in this query. In order to get the latest date per nick,ip,client_id you would write the query as follows:

SELECT
    `nick`,
    INET_NTOA(`ip`) as `ip`,
    HEX(`client_id`) as `client_id`,
    MAX(UNIX_TIMESTAMP(`date`)) as `date`
FROM
`joins`
WHERE
    `nick` LIKE '%nick%'
    -- Can also be things like this:
    --   `ip` & INET_ATON('255.255.0.0') = INET_ATON('123.123.0.0')
GROUP BY
    `nick`,
    `ip`,
    `client_id`
ORDER BY
    `date` DESC
LIMIT 500

There is no need for a subquery at all. This code groups the data and then returns the maximum value of

UNIX_TIMESTAMP(`date`)

as date.

Licenciado em: CC-BY-SA com atribuição

Não afiliado a StackOverflow