Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding Strongly Connected Components in a graph through DFS

I was reading the graph algorithms about BFS and DFS. When I was analyzing the algorithm for finding strongly connected component in a Graph through DFS, a doubt came to my mind. For finding the strongly connected component, what book(Coremen)does, first it ran the DFS on the Graph in order to get the finish time of the vertices then again ran the DFS on the transpose of the graph in decreasing order of the finish time which we got from the first DFS. But I am not able to grasp why the second DFS must be run according to finish time. What I mean is that even if we directly run the DFS (ignoring the finish time) on the transpose of the graph, could it also have given us the connected components because by doing the transpose we have already blocked the path to other components.

like image 841
Manish Avatar asked Dec 07 '12 14:12

Manish


1 Answers

Edit- Here's some good in-depth videos from stanford university on the topic:

http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=IntroToAlgorithms (See 6. CONNECTIVITY IN DIRECTED GRAPHS)

My explanation:

It's possible that you would incorrectly identify the entire graph as a single strongly connected component(SCC) if you don't run the second dfs according to decreasing finish times of the first dfs.

enter image description here

Notice that in my example, node d would always have the lowest finish time from the first dfs. One of nodes a, b, or c will have the highest finish times. Lets assume a has the highest finish time, and so if we ran the second dfs according to decreasing finish times, a would be first.

Now, if you ran the second dfs starting with node d in the transpose of G, you would produce a depth first forest containing the entire graph, and so conclude that the entire graph is a SCC, which is clearly false. However, if you start the dfs with a, then you would not only discover a, b, and c, as being an SCC, but the important part is that they would be marked as visited by being colored grey or black. Then when you continue the dfs on d, you wouldn't traverse out of its SCC because you would realize that its adjacent nodes have been visited.

If you look at cormens code for DFS,

DFS(G)
1 for each vertex u in G.V
2     u.color = WHITE
3     u.π = NIL
4 time = 0
5 for each vertex u in G.V
6     if u.color == WHITE
7         DFS-VISIT(G, u)

DFS-VISIT(G, u)
1 time = time + 1 // white vertex u has just been discovered
2 u.d = time
3 u.color = GRAY
4 for each v in G.adj[u]
5     if v.color == WHITE
6         v.π = u
7         DFS-VISIT(G, u)
8 u.color = BLACK // blacken u; it is finished
9 time = time + 1
10 u.f = time

if you didn't use decreasing finish time, then line 6 of DFS would only be true once, because DFS-VISIT would visit the entire graph recursively. This produces a single tree in the depth first forest, and each tree is an SCC. The reasoning for a single tree is because a tree is identified by its root node having a nil predecessor.

like image 115
goat Avatar answered Sep 30 '22 18:09

goat