Question

Been trying to build a method that gets all conceivable unique paths through a DAG. Went with recursion because it seemed like the easiest to understand. Ended up with this:

public class Brutus {
    //the previous nodes visited
    public ArrayList<Node> resultHistory = new ArrayList<Node>();
    //Directed Graph class, contains a HashMap [adjacencies]
    // that has keys for all nodes that contains all edges
    public AdjacencyList paths;
    //A list of all the pathways between nodes represented as a list of nodes
    public ArrayList<ArrayList<Node>> allPaths = new ArrayList<ArrayList<Node>>();

public Brutus(AdjacencyList paths) {
    this.paths = paths;
}

public ArrayList<ArrayList<Node>> findAll() {
    int counter = 1;
    for (Node key : paths.adjacencies.keySet()) {
        System.out.println("[" + counter + "]: " + key.toString());
        StringTokenizer st = new StringTokenizer(
            paths.getAdjacentString(key), ",");
        while (st.hasMoreTokens()) {
            String child = st.nextToken();
            if (paths.getNodeFromGraph(child) != null) {
                resultHistory = new ArrayList<Node>();
                resultHistory.add(key);
                findPath(child, resultHistory);
            }
        }
        counter++;
    }
    return allPaths;
}

public void findPath(String child, ArrayList<Node> resultHistory) {
    if (resultHistory.contains(paths.getNodeFromGraph(child))) {
        return;
    }
    resultHistory.add(paths.getNodeFromGraph(child));
    if(!(inList(resultHistory, allPaths))) {
        allPaths.add(resultHistory);
    }
    StringTokenizer st = new StringTokenizer(
        paths.getAdjacentString(paths.getNodeFromGraph(child)), ",");
    while (st.hasMoreTokens()) {
        child = st.nextToken();
        if (paths.getNodeFromGraph(child) != null) {
            findPath(child, resultHistory);
        }

    }
}

public boolean inList(ArrayList<Node> resultHistory,
    ArrayList<ArrayList<Node>> allPaths) {
    for (int i = 0; i < allPaths.size();i++) {
        if (allPaths.get(i).equals(resultHistory)) {
            return true;
        }
    }
    return false;
}

Problem is, I don't think it works for all paths, since I can't find certain paths inside it. Although as the dataset is 900 nodes, I am unable to find a pattern! Other questions on Stack seem to be somewhat more specialized and as such I attempted to build my own algorithm!

Can anyone either suggest a superior way to perform this, or tell me what I've done wrong? If the algorithms correct, what would be the best way to withdraw all the paths between two nodes?

EDIT: I now realize that new paths don't get created from child nodes of the original, how would I make it so it does?

Was it helpful?

Solution

Here there is an implementation based on the BFS algorithm. I will denote a path as a sequence of vertices l = (v, v', v'', ...) and I will perform the following two operations on it:

  • extend(l, v): puts vertex v at the end of list l;
  • v = back(l): gets the last vertex in list l.

FindPaths(G, v) {
  // The first path is, simply, the starting node.
  // It should be the first vertex in topological order.
  pending_paths = {(v)};
  while (pending_paths is not empty) {
    l = pending_paths.remove_first(); // pop the first pending path
    output(l); // output it (or save it in a list to be returned, if you prefer)
    v = back(l); // Get the last vertex of the path
    foreach(edge (v, v') in G) { // For each edge outgoing from v'...
      // extend l with v' and put into the list of paths to be examined.
      pending_paths.push_back(extend(l, v'));
    } 
  }
}

OTHER TIPS

Here's a simple recursive algorithm, expressed in pseudocode to avoid clouding the issue with lots of Java list manipulation:

AllPaths(currentNode):
  result = EmptyList()
  foreach child in children(node):
    subpaths = AllPaths(child)
    foreach subpath in subpaths:
      Append(result, currentNode + subpath)
  return result

Calling AllPaths on the root node will give you what you need, and you can improve the running time for nontrivial DAGs by caching the result of AllPaths on each node, so you only need to compute it once rather than once per distinct path that includes it.

So while @akappa's Pseudo was a good start, it took me awhile to understand how to make it work, if anyone else comes across this post here's how I did it:

public ArrayList<ArrayList<Node>> searchAll() {
    try {
        BufferedWriter out = new BufferedWriter(new FileWriter("output.txt"));
        //Gets Nodes from Hashmap and puts them into Queue
        for (Node key : paths.adjacencies.keySet()) {
            queue.addToQueue(new QueueItem(key.chemName, new ArrayList<Node>()));
        }
        while (queue.getSize() > 0) {
            QueueItem queueItem = queue.getFromQueue();
            Node node = paths.getNodeFromGraph(queueItem.getNodeId());
            if (node != null) {
                findRelationAll(node, queueItem, out);
            }
        }
        System.out.println("Cycle complete: Number of Edges: [" + resultHistoryAll.size() + "]");
        out.close();
    } catch (IOException e) {
    }
    return resultHistoryAll;

}

public void findRelationAll(Node node, QueueItem queueItem, BufferedWriter out) {
    if (!foundRelation) {
        StringTokenizer st = new StringTokenizer(paths.getAdjacentString(node), ",");
        while (st.hasMoreTokens()) {
            String child = st.nextToken();
            ArrayList<Node> history = new ArrayList<Node>();
            //Gets previous Nodes
            history.addAll(queueItem.getHistoryPath());
            //Makes sure path is unique
            if (history.contains(node)) {
                System.out.println("[" + counter2 + "]: Cyclic");
                counter2++;
                continue;
            }
            history.add(node);
            resultHistory = history;
            queue.addToQueue(new QueueItem(child, history));
            if (!(inList(resultHistory, resultHistoryAll))) {
                resultHistoryAll.add(resultHistory);
                try {
                    out.write("[" + counter + "]: " + resultHistory.toString());
                    out.newLine();
                    out.newLine();
                } catch (IOException e) {
                }
                System.out.println("[" + counter + "]: " + resultHistory.toString());
                counter++;
            } else {
                System.out.println("[" + counter3 + "]: InList");
                counter3++;
            }

        }
    }
}
//This checks path isn't in List already
public boolean inList(ArrayList<Node> resultHistory, ArrayList<ArrayList<Node>> allPaths) {
    for (int i = 0; i < allPaths.size(); i++) {
        if (allPaths.get(i).equals(resultHistory)) {
            return true;
        }
    }
    return false;
}

}

The above code does a few extra things that you might not want:

  • It writes pathways to a text file as a list of nodes + it's counter value.
  • Makes sure the path doesn't cross the same node twice
  • Makes sure no two pathways are the same in the final list (in normal circumstances this is unnecessary work)

The QueueItem object is just a way to store the previously visited nodes. It's part of nemanja's code, which is what my code was based off.

Hat tip to him, akappa (for the most efficient answer), and jacobm (for finding a solution like my original code, and explaining it's limitations).

Incase anyone's actually interested in the work; I'm currently processing over 5 million pathways, of which 60,000 are unique pathways between 900 chemicals. And that's just 1,2,3 or 4 chemical pathways... Biology is complicated.

EDIT and Warning: IF anyone is handling huge reams of data like me, windows 7 - or at least my machine - throws a shit fit and closes the program after an ArrayList > 63,000 objects, regardless of how you arrange the pointers. The solution I started with was after 60,000 objects, restarting the list while adding everything to CSV. This led to some duplicates between list iteration, and should ultimately be surpassed by my moving to linux tomorrow!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top