select non-duplicated records

https://stackoverflow.com/questions/20785518

21-09-2022
|

Question

I have a table with about 50 millions records.

the table structure is something like below and both callerid and call_start fields are indexed.

id -- callerid -- call_start

I want to select all records that their call_start is greater than '2013-12-22' and callerid is not duplicated before '2013-12-22' in whole table.

I used something like this:

SELECT DISTINCT 
  ca.`callerid` 
FROM
  call_archives AS ca 
WHERE ca.`call_start` >= '2013-12-22' 
  AND ca.`callerid` NOT IN 
  (SELECT DISTINCT 
    ca.`callerid` 
  FROM
    call_archives AS ca 
  WHERE ca.`call_start` < '2013-12-21')

but this is extremely slow, any suggestion is really appreciated.

Solution 2

Just curious if this query works fast or not on your table:

SELECT ca.`callerid` 
FROM call_archives 
GROUP BY ca.`callerid` 
HAVING MIN(ca.`call_start`) >='2013-12-22'

OTHER TIPS

Try with NOT EXISTS

SELECT DISTINCT 
  ca.`callerid` 
FROM
  call_archives AS ca 
WHERE ca.`call_start` >= '2013-12-22' 
  AND NOT EXISTS 
  (SELECT 
    1 
  FROM
    call_archives AS cb 
  WHERE ca.`callerid` = cb.`callerid` 
    AND cb.`call_start` < '2013-12-21')

Use NOT EXISTS instead of NOT IN

Try this:

SELECT DISTINCT ca.callerid 
FROM call_archives AS ca 
WHERE ca.call_start>='2013-12-22' AND 
  NOT EXISTS(SELECT 1 FROM call_archives AS ca1 
             WHERE ca.callerid = ca1.callerid AND ca1.call_start <'2013-12-21');

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow