Domanda

I'm working on optimizing the below query (takes 1:15 minutes now), but I better explain first what I'm doing (long):

I'm selecting (Developer Express) calendar events in a period tt_actualstart - tt_actualfinish for an employee identified by tt_emp_id.
These events are characterized by tt_type values:

0: single event<br />
1: master recurring event<br />
other: exceptions on the recurring events.
  (These have a `tt_parentid` referencing `tt_calendar_id` of the master event)

The 4 'OR'-ed query parts are:

  1. Single and master events in the period for the employee
  2. All exceptions of which the master meets the requirements of 1
  3. Exceptions in the period with master events (anywhere) belonging to the employee
  4. The master events belonging to the exceptions found in 3.
    This query part is the one taking most time.

Schematically:

Legend:   |-------------------| = selection period
          CAPITALS = selected event
          lowercase = related event

1.                               MASTER
                          |-------------------|

2.        EXCEPTION              master
                          |-------------------|

3.        master                EXCEPTION
                          |-------------------|

4.        MASTER                exception                  
                          |-------------------|

There is one complication that influenced these queries: I cannot select on exceptions having the required employee ID. This is because the Developer Express code does not store appointment properties of exceptions when these are equal to the properties of the master parent. Selection on employee ID can therefore only be done on the master parent.

As you can see, the second OR-ed query uses the first one as subquery; likewise for 4 and 3.

select c.*
from tt_calendar c
where 
( 
   ( 
      (c.tt_type in (0,1)) and (c.tt_actualstart <= '2014-03-31') and (c.tt_actualfinish >= '2014-01-01') and (c.tt_emp_id= 20652)
   )

     or 

   (  
      (c.tt_type > 1) and (c.tt_parentid is not null) and  
      (c.tt_parentid in (select c2.tt_calendar_id from tt_calendar c2
                         where (c2.tt_calendar_id=c.tt_parentid) and (c2.tt_actualstart <= '2014-03-31') and (c2.tt_actualfinish >=  '2014-01-01') and (c2.tt_emp_id=20652)
                         )
      )
   )

     or

   (  
      (c.tt_type > 1) and (c.tt_parentid is not null) and  
      (c.tt_actualstart <= '2014-12-31') and (c.tt_actualfinish >= '2014-01-01') and
      (c.tt_parentid in (select c2.tt_calendar_id from tt_calendar c2 where c2.tt_emp_id=20652))
   )

     or

   (  
      (c.tt_type = 1) and
      (c.tt_calendar_id in 
        (select tt_parentid from tt_calendar c2 
         where (c2.tt_type > 1) and (c2.tt_parentid is not null) and  
               (c2.tt_actualstart <=  '2014-03-31') and (c2.tt_actualfinish >= '2014-01-01') and (c2.tt_emp_id=20652)))
   )
)

Adding indices on tt_actualstart and tt_actualfinish gave a 15% speed increase. The (FireBird) query plan shows that these indices are indeed being used.
There are no other indices except for the primary key tt_calendar_id.

One other thing I have tried is using an integer TAG field which gets set by 4 individual UPDATE queries, then selecting the TAGged records.
This actually increases execution time a little.

I'm not sure converting the IN to EXISTS is useful - does Firebird/SQL/Oracle (I need it for all three) already optimize that?

Any more idea on how to improve this query?

Added 17 feb 2014:

There is one outlier case in my test data that all answers so far miss, and that my (heavy) 4th query finds:

(not found) Master ID 27274 for recurring event that has only two occurrences: 10 Dec 2013 - 11 Dec 2013
with
(found) Exception 27275 where the occurrence was moved (from 11 Dec 2013) to 11 Jan 2014 (tt_parent_id=27274)
In this case the occurrence was moved into the selection period, but the master is not in there. I need to find the master too.

BTW The four result sets have some overlap, so in all cases is should be UNION instead of UNION ALL, but that is not clear from my question.

È stato utile?

Soluzione

select c.*
from tt_calendar c

where 
   ( 
      (c.tt_type in (0,1)) and (c.tt_actualstart <= '2014-12-31') and (c.tt_actualfinish >= '2014-01-01') and (c.tt_emp_id= 20652)
   )

UNION

select c.*
from tt_calendar c 
inner join tt_calendar c2 on c2.tt_calendar_id=c.tt_parentid
where
   (  
      (c.tt_type > 1) and (c.tt_parentid is not null) and  
       (c2.tt_actualstart <= '2014-12-31') and (c2.tt_actualfinish >= '2014-01-01') and (c2.tt_emp_id=20652)
   )

UNION

select c.*
from tt_calendar c 
inner join tt_calendar c2 on c2.tt_calendar_id=c.tt_parentid
where
   (  
      (c.tt_type > 1) and (c.tt_parentid is not null) and  
      (c.tt_actualstart <= '2014-12-31') and (c.tt_actualfinish >= '2014-01-01') and (c2.tt_emp_id=20652)
   )

UNION

select c.*
from tt_calendar c 
inner join tt_calendar c2 on c.tt_calendar_id=c2.tt_parentid
where

   (  
      (c.tt_type = 1) and (c.tt_emp_id=20652) and
      (c2.tt_type > 1) and (c2.tt_parentid is not null) and  
      (c2.tt_actualstart <=  '2014-12-31') and (c2.tt_actualfinish >= '2014-01-01')
   )
)
Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top