Question

I have a query using CONNECT BY and START WITH statement like below. The query in the IN clause took less than 5 seconds to run and returns 3000 rows. fact_table contains 20M of records. How can I optimize this SQL below because this query takes forever to run :(

SELECT DISTINCT CONNECT_BY_ROOT a.dst_ID_key AS root_ID_key, a.src_ID_key
  FROM fact_table a
  CONNECT BY NOCYCLE PRIOR a.src_ID_key = a.dst_ID_key
   START WITH a.dst_ID_key IN (SELECT b.ID_key
                           FROM TableA b
                           JOIN TableB c
                             ON (c.name = b.name AND c.school = b.school)
                          WHERE b.status = 'Active')
Was it helpful?

Solution

Using "in" is very slow.
If this data doesn't need real-time update.
I think you can try this:
1、Don't use distinct
2、Use table connection instead of 'in'
3、Use materialized view
Hope helps.Wait for a better answer too. : )

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top