Question

I have the following tables:

Table 1 : Contacts
Fields : id  first_name

Values : 

1    Reeta
2    Rohan
3    John 

Table 2 : email (it contains contact_id of contacts table)
Fields : id   contact_id  email_address

Values :

1  1  r@gmail.com
2  2  r@gmail.com
3  3  j@gmail.com

I want to display all duplicates by email. Like this:

cont_id  first_name  email_address

   1        Reeta  r@gmail.com
   2        Rohan  r@gmail.com

Here is my query :

select contact_id 
from contacts 
where email_address IN (
  SELECT S.email_address 
  FROM  contacts R
  INNER JOIN email
    ON R.id = S.contact_id 
  Group By email_address
  Having Count(S.id) > 1
); 

The query takes long time to execute with large number of records. However the inner query works faster but not the outer one. Please Help.

Was it helpful?

Solution

Yes, it is the problem with MySQL query optimizer. The thing is - IN (subquery) for MySQL is same thing as = ANY () subquery. And MySQL won't optimize that even if subquery returns few count of values which in normal situation will be easily compared via index (if it exists on column for which we're doing IN).

Do not confuse that with IN (<static values list>) - it is completely different case, when MySQL treats IN not as specific type of subquery, but as comparison operator - and, therefore, will use index.

Therefore, one of possible solutions is - separate your subquery so it will return some list of values, then substitute that list to IN (), getting index scan because of treating IN as comparison operator. However, in some cases solution may be got via replacing subquery with JOIN. It's not possible in all cases, thus solution, described above, will fit for common non-trivial case.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top