For others' benefit: I just changed the column type from tinytext to varchar(128), and the query time went down to 0.03 seconds.
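For context on why that change helps: MySQL's in-memory temporary-table engine cannot hold TEXT-family columns (including TINYTEXT), so a GROUP BY on such a column spills to an on-disk temporary table. A sketch of the change, assuming the data fits in 128 characters (the NOT NULL constraint is an assumption):

```sql
-- Sketch of the type change described above; NOT NULL is an assumption.
-- VARCHAR lets the implicit GROUP BY temp table stay in memory.
ALTER TABLE `table_of_downloads`
  MODIFY `email` VARCHAR(128) NOT NULL;
```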
How should I find the rows with a duplicate field in a big table?
24-06-2022
Question
I have a table with 1.5M+ rows recording downloads from a website; each row includes the email address of the person who downloaded something. I want to find those who have downloaded more than 100 times. This is what I have tried, but the query time is more than 11 seconds when I test it on the server! Do you know any faster way?
SELECT `email`
FROM `table_of_downloads`
GROUP BY `email`
HAVING COUNT(*) > 100
Here are the EXPLAIN results, as requested:
id | select_type | table              | type | possible_keys | key  | key_len | ref  | rows    | Extra
1  | SIMPLE      | table_of_downloads | ALL  | NULL          | NULL | NULL    | NULL | 1656546 | Using temporary; Using filesort
Solution 2
Other tips
You need an index on the email column. Otherwise, the query has to scan the entire table to count the rows for each email. There is no way to make it faster other than with an index.
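A sketch of the suggested index (the index name idx_email is made up for illustration):

```sql
-- Index on the grouped column; with it in place, MySQL can satisfy the
-- GROUP BY with an index scan instead of a full table scan + filesort.
CREATE INDEX idx_email ON table_of_downloads (email);
```

After adding it, EXPLAIN should list the index under `key`, and the "Using temporary; Using filesort" note should disappear.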
Not affiliated with StackOverflow