Question

Consider the following directed graph:

I'd like to be able to pick two nodes in this graph, and then find all of the possible paths between those two nodes. (Well, not all of the paths, just those paths not containing cycles.)

First, I model the graph in a view (as a transitive closure)

CREATE OR REPLACE VIEW closure AS 
WITH 
edges AS (
    -- for demonstration purposes, I just need to model the edges
    SELECT 'A' AS start_node, 'C' AS end_node FROM dual  UNION ALL
    SELECT 'A', 'D' FROM dual  UNION ALL
    SELECT 'B', 'C' FROM dual  UNION ALL
    SELECT 'B', 'D' FROM dual  UNION ALL
    SELECT 'C', 'E' FROM dual  UNION ALL
    SELECT 'D', 'F' FROM dual  UNION ALL
    SELECT 'E', 'A' FROM dual  UNION ALL
    SELECT 'E', 'F' FROM dual  UNION ALL
    SELECT 'E', 'G' FROM dual  UNION ALL
    SELECT 'F', 'B' FROM dual  UNION ALL
    SELECT 'F', 'G' FROM dual  
)
SELECT 
    CONNECT_BY_ROOT start_node                                          AS start_node,    
    end_node                                                            AS end_node,
    CONNECT_BY_ROOT start_node||sys_connect_by_path(end_node, ' -> ')   AS path,
    level + 1                                                           AS path_length
FROM 
    edges
CONNECT BY 
    NOCYCLE PRIOR end_node = start_node

-- this bit just adds the trivial paths to the closure
UNION ALL
SELECT DISTINCT start_node, start_node, start_node, 1 FROM edges
;

view CLOSURE created.

Now, suppose that I wanted to see all of the paths that start at 'E' and end at 'G'.

SELECT path, path_length 
FROM closure 
WHERE start_node='E' 
  AND end_node = 'G'
ORDER BY path_length ASC
;

PATH                            PATH_LENGTH
------------------------------  -----------
E -> G                                    2 
E -> F -> G                               3 
E -> A -> C -> E -> G                     5 
E -> A -> D -> F -> G                     5 
E -> F -> B -> C -> E -> G                6 
E -> A -> C -> E -> F -> G                6 
E -> A -> D -> F -> B -> C -> E -> G      8 

 7 rows selected 

I am surprised the the third, fifth, sixth, and seventh rows are returned. These four rows all contain a cycle (specifically, from 'E' to 'E').

Looking at the data in my view, it appears that the only cycles that are included are those that include the root node. (e.g., the cycle 'E' -> 'A' -> 'C' -> 'E') is only included in paths that begin with node 'E'... it is never included in paths that begin with any other node.)

I have tried to prevent these cycles from happening by adding a CONNECT_BY_ROOT start_node <> start_node in my "CONNECT BY" clause, but this is not allowed by Oracle.

How can I prevent these cycles from occurring in my data?

==== BEGIN EDIT ====

As @Egor Skriptunoff pointed out below, I am working with a table of edges, so maybe I should expect to see some vertices repeated, as long as they are not visited over the same edge. However, if this were the case, I should see the path 'E -> F -> B -> D -> F -> G' in the results above. This is a path that cycles through node 'F', once via edge 'E -> F' and once via 'D -> F'. However, this path is not there.

I presume this is because the NOCYCLE is preventing node 'F' from occurring twice. Is this really the case?

Oracle version: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

==== END EDIT ====

Was it helpful?

Solution

Your select clause is flawed - you have to apply CONNECT_BY_ROOT to the head of the edge instead of its tail

-- ...
CONNECT_BY_ROOT end_node                                  AS start_node,    
end_node                                                  AS end_node,
sys_connect_by_path(end_node, ' -> ')                     AS path,
level                                                     AS path_length
-- ...

OTHER TIPS

You are working with a table of edges, so it is repeated edges that are excluded, vertices may repeat.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top