Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do we need to run DFS on the complement of a graph in the Kosaraju's algorithm?

There's a famous algorithm to find the strongly connected components called Kosaraju's algorithm, which uses two DFS's to solve this problem, and runs in θ(|V| + |E|) time.

First we use DFS on complement of the graph (GR) to compute reverse postorder of vertices, and then we apply second DFS on the main graph G by taking vertices in reverse post order to compute the strongly connected components.

Although I understand the mechanics of the algorithm, I'm not getting the intuition behind the need of the reverse post order.

How does it helps the second DFS to find the strongly connected components?

like image 709
Atinesh Avatar asked Jan 03 '14 10:01

Atinesh


2 Answers

suppose result of the first DFS is:

----------v1--------------v2-----------

where "-" indicates any number and all the vertices in a strongly connected component g appear between v1 and v2.

DFS by post order gives the following guarantee that

  • all vertices after v2 would not points to g in the reverse graph(that is to say, you cannot reach these vertices from g in the origin graph)
  • all vertices before v1 cannot be pointed to from g in the reverse graph(that is to say, you cannot reach g from these vertices in the origin graph)

in one word, the first DFS ensures that in the second DFS, strongly connected components that are visited earlier cannot have any edge points to other unvisited strongly connected components.

Some Detailed Explanation

let's simplify the graph as follow:

  • the whole graph is G
  • G contains two strongly connected components, one is g, the other one is a single vertex v
  • there is only one edge between v and g, either from v to g or from g to v, the name of this edge is e
  • g', e' represent the reverse of g, e

the situation in which this algorithm could fail can be conclude as

  1. start DFS from v, and e' points from v to g'
  2. start DFS from a vertex inside of g', and e' points from g' to v

For situation 1

origin graph would be like g-->v, and the reversed graph looks like g'<--v.

To start the second DFS from v, the post order generated by first DFS need to be something like

g1, g2, g3, ..., v

but you would easily find out that neither starting the first DFS from v nor from g' can give you such a post order, so in this situation, it is guaranteed be the first DFS that the second DFS would not start from a vertex that both be out of and points to a strongly connected component.

For situation 2

similar to the situation 1, in situation 2, where the origin graph is g<--v and the reversed on is g'-->v, it is guaranteed that v would be visited before any vertex in g'.

like image 52
zealoct Avatar answered Oct 27 '22 04:10

zealoct


  1. When you run DFS on a graph for the first time, for every node you visit you get the knowledge about all nodes that are reachable from that node (you get this information after the first DFS is finished).

  2. Then, when you inverse all the vertices and run the DFS once more, for every node you visit you get the knowledge about all nodes that can reach that node in the non-inverted graph (again, you get this info after the second DFS finished).

Example: let's say your first DFS reaches node X. From that node "you can see" all the neighbours you can visit. (I hope this is pretty understandable). Then, let's say your second DFS reaches that node X, but this time all the vertices are inverted. If then from your node X "you can see" any other nodes, it means that before inverting the vertices the node X was reachable from all the neighbours you see now. By calling the second DFS in the correct order you get for every node X all the nodes that where reachable from X in both DFS trees (and so, for every node X you get the nodes that were both reachable from X and could reach X - those are strongly connected components by definition).

like image 25
3yakuya Avatar answered Oct 27 '22 04:10

3yakuya