Question

I want to make a report of all the entries in a table where one column has duplicate entries. Let's assume we have a table like this:

customer_name | some_number
Tom           | 1
Steve         | 3
Chris         | 4
Tim           | 3
...

I want to show all the records whose some_number value appears more than once. I have used a query like this to show all the duplicate records:

select customer_name, some_number
from table
where some_number in (select some_number
                      from table
                      group by some_number
                      having count(*) > 1)
order by some_number;

This works for a small table, but the one I actually need to operate on is fairly large (30,000+ rows), and the query takes forever. Does someone have a better way to do this?

Thanks!

Solution

Try this query:

SELECT t1.*
FROM (SELECT some_number, COUNT(*) AS nb
       FROM your_table
       GROUP BY some_number
       -- repeat the aggregate: not every database accepts the alias nb in HAVING
       HAVING COUNT(*) > 1
     ) t2
JOIN your_table t1 ON t1.some_number = t2.some_number

The inner query uses GROUP BY and HAVING to find the some_number values that occur more than once.
Because HAVING filters the grouped result, the derived table contains only the duplicated values; the join with your_table then retrieves all fields of the rows that carry those values.
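
To check the approach on the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (the question does not say which database is used), and the table is recreated in memory just for the demonstration.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (customer_name TEXT, some_number INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("Tom", 1), ("Steve", 3), ("Chris", 4), ("Tim", 3)],
)

# Derived table t2 holds the duplicated some_number values;
# the join pulls back every row that carries one of them.
query = """
SELECT t1.customer_name, t1.some_number
FROM (SELECT some_number
      FROM your_table
      GROUP BY some_number
      HAVING COUNT(*) > 1) t2
JOIN your_table t1 ON t1.some_number = t2.some_number
ORDER BY t1.customer_name
"""

rows = conn.execute(query).fetchall()
print(rows)  # [('Steve', 3), ('Tim', 3)]
```

Only Steve and Tim come back, since 3 is the only value that appears more than once in the sample data.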

Be sure your table has an index on some_number if you want the query to be fast.
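
As a rough illustration of the indexing advice, the sketch below creates an index on some_number and prints the query plan. It again uses Python's sqlite3 as a stand-in database; the index name idx_your_table_some_number is made up for the example, and the exact CREATE INDEX options and plan output will differ on other database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (customer_name TEXT, some_number INTEGER)")

# Hypothetical index name; any name works as long as it covers some_number.
conn.execute(
    "CREATE INDEX idx_your_table_some_number ON your_table (some_number)"
)

# List the indexes on the table; column 1 of each row is the index name.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('your_table')")]
print(index_names)

# Ask SQLite how it would run the join; the wording of the plan
# depends on the SQLite version, so it is printed rather than parsed.
plan_query = """
EXPLAIN QUERY PLAN
SELECT t1.*
FROM (SELECT some_number
      FROM your_table
      GROUP BY some_number
      HAVING COUNT(*) > 1) t2
JOIN your_table t1 ON t1.some_number = t2.some_number
"""
for step in conn.execute(plan_query):
    print(step)
```

On a real database, comparing the plan before and after adding the index is a reasonable way to confirm the index is actually being used.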

OTHER TIPS

Does this perform better? It joins on a table of some_number counts and then filters to include only those with a count > 1.

SELECT t.customer_name, t.some_number
FROM my_table t
INNER JOIN (
  SELECT some_number, COUNT(*) AS ct
  FROM my_table
  GROUP BY some_number
) dup ON t.some_number = dup.some_number
WHERE dup.ct > 1
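
Before swapping queries on the real table, it may be worth confirming that the join form returns the same rows as the IN-subquery form from the question. The sketch below does that on the sample data, using Python's sqlite3 as a stand-in database and my_table as the table name; it checks the results only and says nothing about relative speed on your system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (customer_name TEXT, some_number INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("Tom", 1), ("Steve", 3), ("Chris", 4), ("Tim", 3)],
)

# Form used in the question: filter with IN against a grouped subquery.
in_query = """
SELECT customer_name, some_number
FROM my_table
WHERE some_number IN (SELECT some_number
                      FROM my_table
                      GROUP BY some_number
                      HAVING COUNT(*) > 1)
ORDER BY customer_name
"""

# Form suggested here: join against per-value counts, keep counts > 1.
join_query = """
SELECT t.customer_name, t.some_number
FROM my_table t
INNER JOIN (
  SELECT some_number, COUNT(*) AS ct
  FROM my_table
  GROUP BY some_number
) dup ON t.some_number = dup.some_number
WHERE dup.ct > 1
ORDER BY t.customer_name
"""

in_rows = conn.execute(in_query).fetchall()
join_rows = conn.execute(join_query).fetchall()
print(in_rows == join_rows)  # True
print(join_rows)             # [('Steve', 3), ('Tim', 3)]
```
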
Licensed under: CC-BY-SA with attribution