Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding cycles: DFS versus union-find?

DFS with coloring would take O(V+E) vs union find would take O(ElogV) reference: http://www.geeksforgeeks.org/detect-cycle-undirected-graph/

So union find approach is slower. If V = 100, E = 100, DFS = 200, Union find is 1,000. Is there a reason to use Union find? I personally like it because it produces a clean code. Or anything I missed that union find is better in real practice?

like image 795
codereviewanskquestions Avatar asked Jul 24 '17 03:07

codereviewanskquestions


People also ask

Is union-find faster than DFS?

If the graph is already in memory in adjacency list format, then DFS is slightly simpler and faster (O(n) versus O(n alpha(n)), where alpha(n) is inverse Ackermann), but union-find can handle the edges arriving online in any order, which is sometimes useful (e.g., there are too many to fit in main memory).

How can you look for cycles in a graph using the union-find data structure?

3)Algorithm to find cycle using Union-Find 1)Create disjoint-sets for each of the vertices in the graph. -If u and v both belong to the same disjoint-set then cycle exists in the graph. -Else merge the disjoint-sets in which u and v are present.

Can we use DFS to detect cycle?

Use DFS from every unvisited node. Depth First Traversal can be used to detect a cycle in a Graph.

Can we use union-find to detect cycle in directed graph?

No, we cannot use union-find to detect cycles in a directed graph. This is because a directed graph cannot be represented using the disjoint-set(the data structure on which union-find is performed).


1 Answers

I suspect that you may be misinterpreting how big-O notation works. The notation O(V + E) doesn't mean "the runtime is computed by adding V and E), but rather "the runtime scales as a function of the sum of V and E)." For example, suppose you run DFS on a graph with 1,000 nodes and 1,000 edges and the runtime is 1ms. It's then reasonable to assume that the runtime on a graph with 2,000 nodes and 2,000 edges would be something like 2ms. However, the big-O notation by itself won't tell you what the runtime will be on some given input if you don't have some reference point established. The 1ms figure I gave here is a total guess - you'd have to run the implementation to see what runtime you get.

Similarly, the runtime O(E log V) means "the runtime scales as the product of the number of nodes and the logarithm of the number of edges.) For example, if the runtime on an input with 1,000 nodes and 1,000 edges is 1ms, then the runtime on an input with 1,000 nodes and 2,000 edges would likely be 2ms, and the runtime on an input with 1,000,000 nodes and 1,000 edges would similarly be around 2ms. Again, the only way to find out what the runtime would be on some initial input would be to run it and see what happens.

One other detail - as many other folks have pointed out, the bound given on the union find data structure is for a very inefficient union-find structure. Using a disjoint set forest with path compression and union-by-rank, you could get an asymptotic runtime of O(α(n)) per operation, where α(n) is an extremely slowly-growing function (the Ackerman inverse function) that's essentially 5 for all inputs you could fit into the universe.

With that having been said - the asymptotic runtime of DFS is better than that for the union-find approach, so it's likely to be the faster one in practice. DFS is also relatively easy to implement, so I'd recommend going for that approach.

The advantage of the union-find structure is that it's good for the incremental version of the connectivity problem where edges are continuously being added in. DFS doesn't gracefully handle this case very well.

like image 175
templatetypedef Avatar answered Sep 29 '22 12:09

templatetypedef