How to implement depth first search for graph with a non-recursive approach

Question 1

A DFS without recursion is basically the same as BFS - but use a stack instead of a queue as the data structure.

The thread Iterative DFS vs Recursive DFS and different elements order handles with both approaches and the difference between them (and there is! you will not traverse the nodes in the same order!)

The algorithm for the iterative approach is basically:

DFS(source):
  s <- new stack
  visited <- {} // empty set
  s.push(source)
  while (s is not empty):
    current <- s.pop()
    if (current is in visited):
        continue
    visited.add(current)
    // do something with current
    for each node v such that (current,v) is an edge:
        s.push(v)

Question 2

This is not an answer, but an extended comment, showing the application of the algorithm in @amit's answer to the graph in the current version of the question, assuming 1 is the start node and its neighbors are pushed in the order 2, 4, 3:

               1
             / |  \
            4  |   2
               3 /

Actions            Stack             Visited
=======            =====             =======
push 1             [1]               {}
pop and visit 1    []                {1}
 push 2, 4, 3      [2, 4, 3]         {1}
pop and visit 3    [2, 4]            {1, 3}
 push 1, 2         [2, 4, 1, 2]      {1, 3}
pop and visit 2    [2, 4, 1]         {1, 3, 2}
 push 1, 3         [2, 4, 1, 1, 3]   {1, 3, 2}
pop 3 (visited)    [2, 4, 1, 1]      {1, 3, 2}
pop 1 (visited)    [2, 4, 1]         {1, 3, 2}
pop 1 (visited)    [2, 4]            {1, 3, 2}
pop and visit 4    [2]               {1, 3, 2, 4}
  push 1           [2, 1]            {1, 3, 2, 4}
pop 1 (visited)    [2]               {1, 3, 2, 4}
pop 2 (visited)    []                {1, 3, 2, 4}

Thus applying the algorithm pushing 1's neighbors in the order 2, 4, 3 results in visit order 1, 3, 2, 4. Regardless of the push order for 1's neighbors, 2 and 3 will be adjacent in the visit order because whichever is visited first will push the other, which is not yet visited, as well as 1 which has been visited.

Question 3

The DFS logic should be:

1) if the current node is not visited, visit the node and mark it as visited
2) for all its neighbors that haven't been visited, push them to the stack

For example, let's define a GraphNode class in Java:

class GraphNode {
    int index;
    ArrayList<GraphNode> neighbors;
}

and here is the DFS without recursion:

void dfs(GraphNode node) {
    // sanity check
    if (node == null) {
        return;
    }

    // use a hash set to mark visited nodes
    Set<GraphNode> set = new HashSet<GraphNode>();

    // use a stack to help depth-first traversal
    Stack<GraphNode> stack = new Stack<GraphNode>();
    stack.push(node);

    while (!stack.isEmpty()) {
        GraphNode curr = stack.pop();

        // current node has not been visited yet
        if (!set.contains(curr)) {
            // visit the node
            // ...

            // mark it as visited
            set.add(curr);
        }

        for (int i = 0; i < curr.neighbors.size(); i++) {
            GraphNode neighbor = curr.neighbors.get(i);

            // this neighbor has not been visited yet
            if (!set.contains(neighbor)) {
                stack.push(neighbor);
            }
        }
    }
}

We can use the same logic to do DFS recursively, clone graph etc.

Question 4

Many people will say that non-recursive DFS is just BFS with a stack rather than a queue. That's not accurate, let me explain a bit more.

Recursive DFS

Recursive DFS uses the call stack to keep state, meaning you do not manage a separate stack yourself.

However, for a large graph, recursive DFS (or any recursive function that is) may result in a deep recursion, which can crash your problem with a stack overflow (not this website, the real thing).

Non-recursive DFS

DFS is not the same as BFS. It has a different space utilization, but if you implement it just like BFS, but using a stack rather than a queue, you will use more space than non-recursive DFS.

Why more space?

Consider this:

// From non-recursive "DFS"
for (auto i&: adjacent) {
    if (!visited(i)) {
        stack.push(i);
    }
}

And compare it with this:

// From recursive DFS
for (auto i&: adjacent) {
    if (!visited(i)) {
        dfs(i);
    }
}

In the first piece of code you are putting all the adjacent nodes in the stack before iterating to the next adjacent vertex and that has a space cost. If the graph is large it can make a significant difference.

What to do then?

If you decide to solve the space problem by iterating over the adjacency list again after popping the stack, that's going to add time complexity cost.

One solution is to add items to the stack one by one, as you visit them. To achieve this you can save an iterator in the stack to resume the iteration after popping.

Lazy way

In C/C++, a lazy approach is to compile your program with a larger stack size and increase stack size via ulimit, but that's really lousy. In Java you can set the stack size as a JVM parameter.

Question 5

Recursion is a way to use the call stack to store the state of the graph traversal. You can use the stack explicitly, say by having a local variable of type std::stack, then you won't need the recursion to implement the DFS, but just a loop.

Question 6

okay. if you are still looking for a java code

dfs(Vertex start){
    Stack<Vertex> stack = new Stack<>(); // initialize a stack
    List<Vertex> visited = new ArrayList<>();//maintains order of visited nodes
    stack.push(start); // push the start
    while(!stack.isEmpty()){ //check if stack is empty
        Vertex popped = stack.pop(); // pop the top of the stack
        if(!visited.contains(popped)){ //backtrack if the vertex is already visited
            visited.add(popped); //mark it as visited as it is not yet visited
            for(Vertex adjacent: popped.getAdjacents()){ //get the adjacents of the vertex as add them to the stack
                    stack.add(adjacent);
            }
        }
    }

    for(Vertex v1 : visited){
        System.out.println(v1.getId());
    }
}

Question 7

Python code. The time complexity is O(V+E) where V and E are the number of vertices and edges respectively. The space complexity is O(V) due to the worst-case where there is a path that contains every vertex without any backtracking (i.e. the search path is a linear chain).

The stack stores tuples of the form (vertex, vertex_edge_index) so that the DFS can be resumed from a particular vertex at the edge immediately following the last edge that was processed from that vertex (just like the function call stack of a recursive DFS).

The example code uses a complete digraph where every vertex is connected to every other vertex. Hence it is not necessary to store an explicit edge list for each node, as the graph is an edge list (the graph G contains every vertex).

numv = 1000
print('vertices =', numv)
G = [Vertex(i) for i in range(numv)]

def dfs(source):
  s = []
  visited = set()
  s.append((source,None))
  time = 1
  space = 0
  while s:
    time += 1
    current, index = s.pop()
    if index is None:
      visited.add(current)
      index = 0
    # vertex has all edges possible: G is a complete graph
    while index < len(G) and G[index] in visited:
      index += 1
    if index < len(G):
      s.append((current,index+1))
      s.append((G[index], None))
    space = max(space, len(s))
  print('time =', time, '\nspace =', space)

dfs(G[0])

Output:

time = 2000 
space = 1000

Note that time here is measuring V operations and not E. The value is numv*2 because every vertex is considered twice, once on discovery and once on finishing.

Question 8

Acutally, stack is not well able to deal with discover time and finish time, if we want to implement DFS with stack, and want to deal with discover time and finish time, we would need to resort to another recorder stack, my implementation is shown below, have test correct, below is for case-1, case-2 and case-3 graph.

from collections import defaultdict

class Graph(object):

    adj_list = defaultdict(list)

    def __init__(self, V):
        self.V = V

    def add_edge(self,u,v):
        self.adj_list[u].append(v)

    def DFS(self):
        visited = []
        instack = []
        disc = []
        fini = []
        for t in range(self.V):
            visited.append(0)
            disc.append(0)
            fini.append(0)
            instack.append(0)

        time = 0
        for u_ in range(self.V):
            if (visited[u_] != 1):
                stack = []
                stack_recorder = []
                stack.append(u_)
                while stack:
                    u = stack.pop()
                    visited[u] = 1
                    time+=1
                    disc[u] = time
                    print(u)
                    stack_recorder.append(u)
                    flag = 0
                    for v in self.adj_list[u]:
                        if (visited[v] != 1):
                            flag = 1
                            if instack[v]==0:
                                stack.append(v)
                            instack[v]= 1



                    if flag == 0:
                        time+=1
                        temp = stack_recorder.pop()
                        fini[temp] = time
                while stack_recorder:
                    temp = stack_recorder.pop()
                    time+=1
                    fini[temp] = time
        print(disc)
        print(fini)

if __name__ == '__main__':

    V = 6
    G = Graph(V)

#==============================================================================
# #for case 1
#     G.add_edge(0,1)
#     G.add_edge(0,2)
#     G.add_edge(1,3)
#     G.add_edge(2,1)
#     G.add_edge(3,2) 
#==============================================================================

#==============================================================================
# #for case 2
#     G.add_edge(0,1)
#     G.add_edge(0,2)
#     G.add_edge(1,3)
#     G.add_edge(3,2)  
#==============================================================================

#for case 3
    G.add_edge(0,3)    
    G.add_edge(0,1)

    G.add_edge(1,4)
    G.add_edge(2,4)
    G.add_edge(2,5)
    G.add_edge(3,1)
    G.add_edge(4,3)
    G.add_edge(5,5)    


    G.DFS()

Question 9

I think you need to use a visited[n] boolean array to check if the current node is visited or not earlier.

Question 10

A recursive algorithm works very well for DFS as we try to plunge as deeply as we can, ie. as soon as we find an un-explored vertex, we're going to explore its FIRST un-explored neighbor right away. You need to BREAK out of the for loop as soon as you find the first un-explored neighbor.

for each neighbor w of v
   if w is not explored
       mark w as explored
       push w onto the stack
       BREAK out of the for loop

Question 11

I think this is an optimized DFS regarding space-correct me if I am wrong.

s = stack
s.push(initial node)
add initial node to visited
while s is not empty:
    v = s.peek()
    if for all E(v,u) there is one unvisited u:
        mark u as visited
        s.push(u)
    else 
        s.pop

Question 12

Using Stack and implementing as done by the call stack in the recursion process-

The Idea is to push a vertex in the stack, and then push its vertex adjacent to it which is stored in a adjacency list at the index of the vertex and then continue this process until we cannot move further in the graph, now if we cannot move ahead in the graph then we will remove the vertex which is currently on the top of the stack as it is unable to take us on any vertex which is unvisited.

Now, using stack we take care of the point that the vertex is only removed from the stack when all the vertices that can be explored from the current vertex have been visited, which was being done by the recursion process automatically.

for ex -

See the example graph here.

( 0 ( 1 ( 2 ( 4 4 ) 2 ) ( 3 3 ) 1 ) 0 ) ( 6 ( 5 5 ) ( 7 7 ) 6 )

The above parenthesis show the order in which the vertex is added on the stack and removed from the stack, so a parenthesis for a vertex is closed only when all the vertices that can be visited from it have been done.

(Here I have used the Adjacency List representation and implemented as a vector of list (vector > AdjList) by using C++ STL)

void DFSUsingStack() {
    /// we keep a check of the vertices visited, the vector is set to false for all vertices initially.
    vector<bool> visited(AdjList.size(), false);

    stack<int> st;

    for(int i=0 ; i<AdjList.size() ; i++){
        if(visited[i] == true){
            continue;
        }
        st.push(i);
        cout << i << '\n';
        visited[i] = true;
        while(!st.empty()){
            int curr = st.top();
            for(list<int> :: iterator it = AdjList[curr].begin() ; it != AdjList[curr].end() ; it++){
                if(visited[*it] == false){
                    st.push(*it);
                    cout << (*it) << '\n';
                    visited[*it] = true;
                    break;
                }
            }
            /// We can move ahead from current only if a new vertex has been added on the top of the stack.
            if(st.top() != curr){
                continue;
            }
            st.pop();
        }
    }
}

Question 13

The following Java Code will be handy:-

private void DFS(int v,boolean[] visited){
    visited[v]=true;
    Stack<Integer> S = new Stack<Integer>();
    S.push(v);
    while(!S.isEmpty()){
        int v1=S.pop();     
        System.out.println(adjLists.get(v1).name);
        for(Neighbor nbr=adjLists.get(v1).adjList; nbr != null; nbr=nbr.next){
             if (!visited[nbr.VertexNum]){
                 visited[nbr.VertexNum]=true;
                 S.push(nbr.VertexNum);
             }
        }
    }
}
public void dfs() {
    boolean[] visited = new boolean[adjLists.size()];
    for (int v=0; v < visited.length; v++) {
        if (!visited[v])/*This condition is for Unconnected Vertices*/ {

            System.out.println("\nSTARTING AT " + adjLists.get(v).name);
            DFS(v, visited);
        }
    }
}