Question

Using CONNECT BY LEVEL seems to return too many rows when performed on a table. What is the logic behind what's happening?

Assuming the following table:

create table a ( id number );

insert into a values (1);
insert into a values (2);
insert into a values (3);

This query returns 12 rows (SQL Fiddle).

 select id, level as lvl
   from a
connect by level <= 2
  order by id, level

One row for each in table A with the value of column LVL being 1 and three for each in table A where the column LVL is 2, i.e.:

ID | LVL 
---+-----
 1 |  1 
 1 |  2 
 1 |  2 
 1 |  2 
 2 |  1 
 2 |  2 
 2 |  2 
 2 |  2 
 3 |  1 
 3 |  2 
 3 |  2 
 3 |  2 

It is equivalent to this query, which returns the same results.

 select id, level as lvl
   from dual
  cross join a
connect by level <= 2
  order by id, level

I don't understand why these queries return 12 rows or why there are three rows where LVL is 2 and only one where LVL is 1 for each value of the ID column.

Increasing the number of levels that are "connected" to 3 returns 13 rows for each value of ID. 1 where LVL is 1, 3 where LVL is 2 and 9 where LVL is 3. This seems to suggest that the rows returned are the number of rows in table A to the power of the value of LVL minus 1.

I would have though that these queries would be the same as the following, which returns 6 rows

select id, lvl
  from ( select level  as lvl
           from dual
        connect by level  <= 2
                )
 cross join a
 order by id, lvl

The documentation isn't particularly clear, to me, in explaining what should occur. What's happening with these powers and why aren't the first two queries the same as the third?

Was it helpful?

Solution

In the first query, you connect by just the level. So if level <= 1, you get each of the records 1 time. If level <= 2, then you get each level 1 time (for level 1) + N times (where N is the number of records in the table). It is like you are cross joining, because you're just picking all records from the table until the level is reached, without having other conditions to limit the result. For level <= 3, this is done again for each of those results.

So for 3 records:

  • Lvl 1: 3 record (all having level 1)
  • Lvl 2: 3 records having level 1 + 3*3 records having level 2 = 12
  • Lvl 3: 3 + 3*3 + 3*3*3 = 39 (indeed, 13 records each).
  • Lvl 4: starting to see a pattern? :)

It's not really a cross join. A cross join would only return those records that have level 2 in this query result, while with this connect by, you get the records having level 1 as well as the records having level 2, thus resulting in 3 + 3*3 instead of just 3*3 record.

OTHER TIPS

When connect by is used without start with clause and prior operator, there is no restriction on joining children row to a parent row. And what Oracle does in this situation, it returns all possible hierarchy permutations by connecting a row to every row of level higher.

SQL> select b
  2       , level as lvl
  3       , sys_connect_by_path(b, '->') as ph
  4     from a
  5  connect by level <= 2
  6  ;

         B        LVL PH
       ---------- ---------- 
         1          1 ->1
         1          2 ->1->1
         2          2 ->1->2
         3          2 ->1->3
         2          1 ->2
         1          2 ->2->1
         2          2 ->2->2
         3          2 ->2->3
         3          1 ->3
         1          2 ->3->1
         2          2 ->3->2
         3          2 ->3->3

12 rows selected

you're comparing apples to oranges when comparing the final query to the others as the LEVEL is isolated in that to the 1-row dual table.

lets consider this query:

 select id, level as lvl
   from a
connect by level <= 2
  order by id, level

what that is saying is, start with the table set (select * From a). then, for each row returned connect this row to the prior row. as you have not defined a join in the connect by, this is in effect a Cartesian join, so when you have 3 rows of (1,2,3) 1 joins to 2, 1->3, 2->1, 2->3, 3->1 and 3->2 and they also join to themselves 1->1,2->2 and 3->3. these joins are level=2. so we have 9 joins there, which is why you get 12 rows (3 original "level 1" rows plus the Cartesian set).

so the number of rows output = rowcount + (rowcount^2)

in the last query you are isolating level to this

select level  as lvl
           from dual
        connect by level  <= 2

which of course returns 2 rows. this is then cartesianed to the original 3 rows, giving 6 rows as output.

You can use technique below to overcome this issue:

select id, level as lvl
   from a
      left outer join (select level l from dual connect by level <= 2) lev on 1 = 1
order by id
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top