So while @akappa's Pseudo was a good start, it took me awhile to understand how to make it work, if anyone else comes across this post here's how I did it:
public ArrayList<ArrayList<Node>> searchAll() {
try {
BufferedWriter out = new BufferedWriter(new FileWriter("output.txt"));
//Gets Nodes from Hashmap and puts them into Queue
for (Node key : paths.adjacencies.keySet()) {
queue.addToQueue(new QueueItem(key.chemName, new ArrayList<Node>()));
}
while (queue.getSize() > 0) {
QueueItem queueItem = queue.getFromQueue();
Node node = paths.getNodeFromGraph(queueItem.getNodeId());
if (node != null) {
findRelationAll(node, queueItem, out);
}
}
System.out.println("Cycle complete: Number of Edges: [" + resultHistoryAll.size() + "]");
out.close();
} catch (IOException e) {
}
return resultHistoryAll;
}
public void findRelationAll(Node node, QueueItem queueItem, BufferedWriter out) {
if (!foundRelation) {
StringTokenizer st = new StringTokenizer(paths.getAdjacentString(node), ",");
while (st.hasMoreTokens()) {
String child = st.nextToken();
ArrayList<Node> history = new ArrayList<Node>();
//Gets previous Nodes
history.addAll(queueItem.getHistoryPath());
//Makes sure path is unique
if (history.contains(node)) {
System.out.println("[" + counter2 + "]: Cyclic");
counter2++;
continue;
}
history.add(node);
resultHistory = history;
queue.addToQueue(new QueueItem(child, history));
if (!(inList(resultHistory, resultHistoryAll))) {
resultHistoryAll.add(resultHistory);
try {
out.write("[" + counter + "]: " + resultHistory.toString());
out.newLine();
out.newLine();
} catch (IOException e) {
}
System.out.println("[" + counter + "]: " + resultHistory.toString());
counter++;
} else {
System.out.println("[" + counter3 + "]: InList");
counter3++;
}
}
}
}
//This checks path isn't in List already
public boolean inList(ArrayList<Node> resultHistory, ArrayList<ArrayList<Node>> allPaths) {
for (int i = 0; i < allPaths.size(); i++) {
if (allPaths.get(i).equals(resultHistory)) {
return true;
}
}
return false;
}
}
The above code does a few extra things that you might not want:
- It writes pathways to a text file as a list of nodes + it's counter value.
- Makes sure the path doesn't cross the same node twice
- Makes sure no two pathways are the same in the final list (in normal circumstances this is unnecessary work)
The QueueItem object is just a way to store the previously visited nodes. It's part of nemanja's code, which is what my code was based off.
Hat tip to him, akappa (for the most efficient answer), and jacobm (for finding a solution like my original code, and explaining it's limitations).
Incase anyone's actually interested in the work; I'm currently processing over 5 million pathways, of which 60,000 are unique pathways between 900 chemicals. And that's just 1,2,3 or 4 chemical pathways... Biology is complicated.
EDIT and Warning: IF anyone is handling huge reams of data like me, windows 7 - or at least my machine - throws a shit fit and closes the program after an ArrayList > 63,000 objects, regardless of how you arrange the pointers. The solution I started with was after 60,000 objects, restarting the list while adding everything to CSV. This led to some duplicates between list iteration, and should ultimately be surpassed by my moving to linux tomorrow!