Question

I'm using MySQL 5.0 and need to tune the query below. Can anyone suggest what I can do to make it faster?

SELECT DISTINCT(alert_master_id) FROM alert_appln_header 
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
SELECT DISTINCT(alert_master_id) FROM alert_details 
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY) 
UNION
SELECT DISTINCT(alert_master_id) FROM alert_sara_header 
WHERE sara_master_id IN 
(SELECT alert_sara_master_id FROM alert_sara_lines 
WHERE end_date IS NULL) AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;

Solution

The first thing that I'd do is rewrite the subqueries as joins:

SELECT      h.alert_master_id

FROM        alert_appln_header h

       JOIN schedule_config c
         ON c.schedule_name = 'Purging_Config'

  LEFT JOIN alert_details d
         ON d.alert_master_id = h.alert_master_id
        AND d.end_date IS NULL
        AND d.created_date < CURRENT_DATE - INTERVAL c.parameters DAY

  LEFT JOIN (
              alert_sara_header s
         JOIN alert_sara_lines  l
           ON l.alert_sara_master_id = s.sara_master_id
            )
         ON s.alert_master_id = h.alert_master_id
        AND l.end_date IS NULL
        AND s.created_date < CURRENT_DATE - INTERVAL c.parameters DAY

WHERE       h.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
        AND d.alert_master_id IS NULL
        AND s.alert_master_id IS NULL

GROUP BY    h.alert_master_id

LIMIT       5000

If it's still slow after that, re-examine your indexing strategy. I'd suggest indexes over:

  • alert_appln_header(alert_master_id,created_date)
  • schedule_config(schedule_name)
  • alert_details(alert_master_id,end_date,created_date)
  • alert_sara_header(sara_master_id,alert_master_id,created_date)
  • alert_sara_lines(alert_sara_master_id,end_date)
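
As a rough sketch, the indexes could be created along these lines. The index names are made up for illustration, and the column placement follows the question's query, where end_date is filtered on alert_sara_lines rather than on alert_sara_header. If schedule_name is already a primary or unique key, the second index is unnecessary.

```sql
-- Hypothetical index names; adjust the column lists to your actual schema.
CREATE INDEX idx_appln_master_created
    ON alert_appln_header (alert_master_id, created_date);
CREATE INDEX idx_config_name
    ON schedule_config (schedule_name);
CREATE INDEX idx_details_master_end_created
    ON alert_details (alert_master_id, end_date, created_date);
CREATE INDEX idx_sara_header_lookup
    ON alert_sara_header (sara_master_id, alert_master_id, created_date);
CREATE INDEX idx_sara_lines_master_end
    ON alert_sara_lines (alert_sara_master_id, end_date);

-- Prefix a query with EXPLAIN and check the "key" column
-- to see which index, if any, MySQL chooses for each table.
EXPLAIN
SELECT h.alert_master_id
FROM   alert_appln_header h
  JOIN schedule_config c
    ON c.schedule_name = 'Purging_Config'
WHERE  h.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
LIMIT  10;
```

Running EXPLAIN on the full query before and after adding the indexes should show whether they are actually being used.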

OTHER TIPS

OK, this may be just a shot in the dark, but I don't think you need that many DISTINCTs here.

SELECT DISTINCT(alert_master_id) FROM alert_appln_header 
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
     -- removed distinct here --
    SELECT alert_master_id FROM alert_details 
    WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY) 
    UNION
     -- removed distinct here --
    SELECT alert_master_id FROM alert_sara_header 
    WHERE sara_master_id IN 
        (SELECT alert_sara_master_id FROM alert_sara_lines 
        WHERE end_date IS NULL) 
    AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;

DISTINCT can be costly, so avoid it where it isn't needed. The outer WHERE clause only checks for ids that are NOT in the subquery result, so it doesn't matter if some ids appear in that result more than once. (A plain UNION also removes duplicates from the combined result on its own.)
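
A small sketch with literal values (not the tables from the question) shows why duplicates are harmless here. UNION ALL is used on purpose so that the duplicates are kept in the list:

```sql
-- Duplicates in the list do not change the outcome of NOT IN:
SELECT 1 NOT IN (SELECT 2 UNION ALL SELECT 2) AS no_match;   -- 1 (true)
SELECT 2 NOT IN (SELECT 2 UNION ALL SELECT 2) AS has_match;  -- 0 (false)

-- Caveat: a NULL in the list turns a non-matching NOT IN into NULL,
-- so the row would be filtered out by a WHERE clause.
SELECT 1 NOT IN (SELECT 2 UNION ALL SELECT NULL) AS with_null;  -- NULL
```

The last case matters if alert_master_id can be NULL in any of the subquery tables: NOT IN would then return no rows at all, whereas a LEFT JOIN ... IS NULL rewrite ignores such rows.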

Licensed under: CC-BY-SA with attribution