Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Union/find algorithm without union by rank for disjoint-set forests data structure

Here's a breakdown on the union/find algorithm for disjoint set forests on wikipedia:

  • Barebone disjoint-set forests... (O(n))
    • ... with union by rank ... (now improved to O(log(n))
      • ... with path compression (now improved to O(a(n)), effectively O(1))

Implementing union by rank necessitates that each node keeps a rank field for comparison purposes. My question is, is union by rank worth this additional space? What happens if I skip union by rank and just do path compression instead? Is it good enough? What is the amortized complexity now?


A comment is made that implies that union by rank without path compression (amortized O(log(n) complexity) is sufficient for most practical application. This is correct. What I'm asking is the other way around: what if you skip union by rank and ONLY do path compression instead?

In a sense, path compression is an extra step to improve union by rank, and that's why that extra step can be omitted without disastrous consequence. But is union by rank a necessary intermediate step to path compression? Can I skip it and go straight to path compression, or will that be catastrophic?


It was also pointed out that without union by rank, repeated unions could create a linked-list like structure. This means that a single path compression operation could take O(n) in the worst case. This would of course affect future operations, so how this plays out when amortized over many operations is what I'm more interested in.

like image 631
polygenelubricants Avatar asked Feb 24 '10 02:02

polygenelubricants


People also ask

What is the rank in union-find?

The idea of UNION BY RANK is to ensure that when we combine two trees, we try to keep the overall depth of the resulting tree small. This is implemented as follows: the rank of an element x is initialized to 0 by MAKESET. An element's rank is only updated by the LINK operation.

How many find set and union operations are done on disjoint-set data structure in Kruskal's algorithm?

This can be done in O(E*log(E)) = O(E*log(V^2)) = O(E*2*log(V)) = O(E*log(V)) time. Then iterates through the edges and executes O(E) disjoint-set data structure operations ( 2 find & possibly 1 union on every iteration). The complexity of the operations depends on the disjoint-set implementation.

Which algorithm uses union-find?

In the Kruskal's Algorithm, Union Find Data Structure is used as a subroutine to find the cycles in the graph, which helps in finding the minimum spanning tree.

Which data structure could be used to implement the disjoint sets data structure?

One way to implement disjoint set data structures is to represent each set by a linked list. Each element (object) will be in a linked list and will contain a pointer to the next element in the set and another pointer to the representative of the set.


1 Answers

I googled for "without union by rank" and the second link that came up was this one:

...We close this section with an analysis of union–find with path compression but without union by rank...

The union-find datastructure with path compression but without union by rank processes m find and n-1 link operations in time O((m+n) log n)

like image 181
jkff Avatar answered Sep 29 '22 20:09

jkff