Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why path compression doesn't change rank in UnionFind?

I'm looking at the implementation of UnionFind with union by rank and path compression from here http://en.wikipedia.org/wiki/Disjoint-set_data_structure#Disjoint-set_forests (it's pretty much the same pseudo-code as in CLRS) and don't understand why path compression doesn't change rank. If we call find for an endpoint of the longest path from the root the rank should go down and if it doesn't the next union operation will choose an incorrect root.

like image 772
synapse Avatar asked Aug 14 '14 20:08

synapse


People also ask

What is the difference between Union by rank and path compression?

Union by rank ensures that the maximum depth of the tree is log N, so it puts a worst case upper bound of O (log N) on every operation. Path compression without any special union rules an upper bound of O (log N) on the amortized cost of each operation, but doesn't limit the worst case cost.

How does path compression affect the rank of a tree?

Because of path compression, rank is no longer depth of the tree (it becomes an upper bound on depth). Suppose T1 has 2 branches (let rank of T1 be 3), and T2 has depth 2 and rank 2. Now suppose we perform find operation (with path compression) on the leaf of T1 marked with "*" below.

What is Union by rank in data structures?

To implement union by rank, each element is associated with a rank. Initially a set has one element and a rank of zero. Both trees have the different ranks – the resulting set’s rank is the larger of the two. Ranks are used instead of height or depth because path compression will change the trees’ heights over time.

How can we optimize the Union by rank?

We can optimize it by using union by rank. Union by rank always attaches the shorter tree to the root of the taller tree. To implement union by rank, each element is associated with a rank. Initially a set has one element and a rank of zero. Both trees have the same rank – the resulting set’s rank is one larger


2 Answers

"Rank" is one of those horribly overloaded terms in theoretical computer science. As Wikipedia notes, in the context of that disjoint set data structure with path compression, rank is not an intrinsic property of the current topology of the forest -- there's just no good way to keep the height of each node up to date. As defined by the sequence of unions, however, rank is useful in proving the running time bound involving the inverse Ackermann function.

like image 114
David Eisenstat Avatar answered Nov 16 '22 01:11

David Eisenstat


Rank is not the actual depth of the tree rather it is an upper bound. As such, on a find operation, the rank is allowed to get out of sync with the depth.

like image 27
borrible Avatar answered Nov 16 '22 01:11

borrible