kosaraju finding finishing time using iterative dfs

Question

here is the first part of the code that i have did for Kosaraju's algorithm.

###### reading the data #####
with open('data.txt') as req_file:
        ori_data = []
        for line in req_file:
            line = line.split()
            if line:
                line = [int(i) for i in line]
                ori_data.append(line)

###### forming the Grev ####
revscc_dic = {}
for temp in ori_data:
    if temp[1] not in revscc_dic:
        revscc_dic[temp[1]] = [temp[0]]
    else:
        revscc_dic[temp[1]].append(temp[0])

print revscc_dic        

######## finding the G#####
scc_dic = {}
for temp in ori_data:
    if temp[0] not in scc_dic:
        scc_dic[temp[0]] = [temp[1]]
    else:
        scc_dic[temp[0]].append(temp[1])

print scc_dic        

##### iterative dfs ####
path = []
for i in range(max(max(ori_data)),0,-1):
    start = i
    q=[start]
    while q:
        v=q.pop(0)
        if v not in path:
          path.append(v)
          q=revscc_dic[v]+q
print path

The code reads the data and forms Grev and G correctly. I have written a code for iterative dfs. How can i include to find the finishing time ?? I understand finding the finishing time using paper and pen but I do not understand the part of finishing time as a code ?? how can I implement it.. Only after this I can proceed my next part of code. Pls help. Thanks in advance.

The data.txt file contains:

please save it as data.txt.

James Lawson · Accepted Answer

With recursive dfs, it is easy to see when a given vertex has "finished" (i.e. when we have visited all of its children in the dfs tree). The finish time can be calculated just after the recursive call has returned.
However with iterative dfs, this is not so easy. Now that we are iteratively processing the queue using a while loop we have lost some of the nested structure that is associated with function calls. Or more precisely, we don't know when backtracking occurs. Unfortunately, there is no way to know when backtracking occurs without adding some additional information to our stack of vertices.

The quickest way to add finishing times to your dfs implementation is like so:

##### iterative dfs (with finish times) ####
path = []
time = 0
finish_time_dic = {}
for i in range(max(max(ori_data)),0,-1):
    start = i
    q = [start]
    while q:
        v = q.pop(0)
        if v not in path:
            path.append(v)
            q = [v] + q
            for w in revscc_dic[v]:
                if w not in path: q = [w] + q
        else:
            if v not in finish_time_dic:
                finish_time_dic[v] = time
                time += 1
print path  
print finish_time_dic

The trick used here is that when we pop off v from the stack, if it is the first time we have seen it, then we add it back to the stack again. This is done using: q = [v] + q. We must push v onto the stack before we push on its neighbours (we write the code that pushes v before the for loop that pushes v's neighbours) - or else the trick doesn't work. Eventually we will pop v off the stack again. At this point, v has finished! We have seen v before, so, we go into the else case and compute a fresh finish time.

For the graph provided, finish_time_dic gives the correct finishing times:

{1: 6, 2: 1, 3: 3, 4: 7, 5: 0, 6: 4, 7: 8, 8: 2, 9: 5}

Note that this dfs algorithm (with the finishing times modification) still has O(V+E) complexity, despite the fact that we are pushing each node of the graph onto the stack twice. However, more elegant solutions exist. I recommend reading Chapter 5 of Python Algorithms: Mastering Basic Algorithms in the Python Language by Magnus Lie Hetland (ISBN: 1430232374, 9781430232377). Question 5-6 and 5-7 (on page 122) describe your problem exactly. The author answers these questions and gives an alternate solution to the problem.

Questions:

5-6 In recursive DFS, backtracking occurs when you return from one of the recursive calls. But where has the backtracking gone in the iterative version?

5-7. Write a nonrecursive version of DFS that can deal determine finish-times.

Answers:

5-6 It’s not really represented at all in the iterative version. It just implicitly occurs once you’ve popped off all your “traversal descendants” from the stack.

5-7 As explained in Exercise 5-6, there is no point in the code where backtracking occurs in the iterative DFS, so we can’t just set the finish time at some specific place (like in the recursive one). Instead, we’d need to add a marker to the stack. For example, instead of adding the neighbors of u to the stack, we could add edges of the form (u, v), and before all of them, we’d push (u, None), indicating the backtracking point for u.

R Buragohain · Answer

Here are the recursive and iterative implementations in java:

int time = 0;
public void dfsRecursive(Vertex vertex) {
        time += 1;
        vertex.setVisited(true);
        vertex.setDiscovered(time);
        for (String neighbour : vertex.getNeighbours()) {
            if (!vertices.get(neighbour).getVisited()) {
                dfsRecursive(vertices.get(neighbour));
            }
        }
        time += 1;
        vertex.setFinished(time);
    }

    public void dfsIterative(Vertex vertex) {
        Stack<Vertex> stack = new Stack<>();
        stack.push(vertex);
        while (!stack.isEmpty()) {
            Vertex current = stack.pop();
            if (!current.getVisited()) {
                time += 1;
                current.setVisited(true);
                current.setDiscovered(time);
                stack.push(current);
                List<String> currentsNeigbours = current.getNeighbours();
                for (int i = currentsNeigbours.size() - 1; i >= 0; i--) {
                    String currentNeigbour = currentsNeigbours.get(i);
                    Vertex neighBour = vertices.get(currentNeigbour);
                    if (!neighBour.getVisited())
                        stack.push(neighBour);
                }
            } else {
                if (current.getFinished() < 1) {
                    time += 1;
                    current.setFinished(time);
                }
            }
        }
    }

raychi · Answer

Iterative DFS itself is not complicated, as seen from Wikipedia. However, calculating the finish time of each node requires some tweaks to the algorithm. We only pop the node off the stack the 2nd time we encounter it.

Here's my implementation which I feel demonstrates what's going on a bit more clearly:

step = 0  # time counter

def dfs_visit(g, v):
    """Run iterative DFS from node V"""
    global step
    total = 0
    stack = [v]  # create stack with starting vertex
    while stack:  # while stack is not empty
        step += 1
        v = stack[-1]  # peek top of stack
        if v.color:  # if already seen
            v = stack.pop()  # done with this node, pop it from stack
            if v.color == 1:  # if GRAY, finish this node
                v.time_finish = step
                v.color = 2  # BLACK, done
        else:  # seen for first time
            v.color = 1  # GRAY: discovered
            v.time_discover = step
            total += 1
            for w in v.child:  # for all neighbor (v, w)
                if not w.color:  # if not seen
                    stack.append(w)
    return total

def dfs(g):
    """Run DFS on graph"""
    global step
    step = 0  # reset step counter
    for k, v in g.nodes.items():
        if not v.color:
            dfs_visit(g, v)

I am following the conventions of the CLR Algorithm Book and use node coloring to designate its state during the DFS search. I feel this is easier to understand than using a separate list to track node state.

All nodes start out as white. When it's discovered during the search it is marked as gray. When we are done with it, it is marked as black.

Within the while loop, if a node is white we keep it in the stack, and change its color to gray. If it's gray we change its color to black, and set its finish time. If it's black we just ignore it.

It is possible for a node on the stack to be black (even with our coloring check before adding it to the stack). A white node can be added to the stack twice (via two different neighbors). One will eventually turn black. When we reach the 2nd instance on the stack, we need to make sure we don't change its already set finish time.

Here are some additional support codes:

class Node(object):
    def __init__(self, name=None):
        self.name = name
        self.child = []  # children | adjacency list
        self.color = 0  # 0: white [unvisited], 1: gray [found], 2: black [finished]
        self.time_discover = None  # DFS
        self.time_finish = None  # DFS

class Graph(object):
    def __init__(self):
        self.nodes = defaultdict(Node)  # list of Nodes
        self.max_heap = []  # nodes in decreasing finish time for SCC

    def build_max_heap(self):
        """Build list of nodes in max heap using DFS finish time"""
        for k, v in self.nodes.items():
            self.max_heap.append((0-v.time_finish, v))  # invert finish time for max heap
        heapq.heapify(self.max_heap)

To run DFS on the reverse graph, you can build a parent list similar to the child list for each Node when the edges file is processed, and use the parent list instead of the child list in dfs_visit().

To process Nodes in decreasing finish time for the last part of SCC computation, you can build a max heap of Nodes, and use that max heap in dfs_visit() instead of simply the child list.

    while g.max_heap:
        v = heapq.heappop(g.max_heap)[1]
        if not v.color:
           size = dfs_visit(g, v)
           scc_size.append(size)

drajc · Answer

I had a few issues with the order produced by Lawson's version of the iterative DFS. Here is code for my version which has a 1-to-1 mapping with a recursive version of DFS.

n = len(graph)
time = 0
finish_times = [0] * (n + 1)
explored = [False] * (n + 1)

# Determine if every vertex connected to v
# has already been explored
def all_explored(G, v):
    if v in G:
        for w in G[v]:
            if not explored[w]:
                return False
    return True

# Loop through vertices in reverse order
for v in xrange(n, 0, -1):
    if not explored[v]:
        stack = [v]
        while stack:
            print(stack)
            v = stack[-1]
            explored[v] = True

            # If v still has outgoing edges to explore
            if not all_explored(graph_reversed, v):
                for w in graph_reversed[v]:

                    # Explore w before others attached to v
                    if not explored[w]:
                        stack.append(w)
                        break

            # We have explored vertices findable from v
            else:
                stack.pop()
                time += 1
                finish_times[v] = time

kosaraju finding finishing time using iterative dfs

Tags:

python

algorithm

kosaraju-sharir

user 3317704

4 Answers

James Lawson

R Buragohain

raychi

drajc

Recent Activity

Donate For Us

kosaraju finding finishing time using iterative dfs

Tags:

python

algorithm

kosaraju-sharir

user 3317704

4 Answers

James Lawson

R Buragohain

raychi

drajc

Related questions

Recent Activity

Donate For Us