Recently, I read this and was surprised that the time complexity of the union-find algorithm with path compression alone is O((m+n) log n), where m is the number of 'find' queries and n is the number of 'merge' queries.
I was interested in this complexity (because I usually implement this algorithm without rank, and even when I applied union by rank upside down, the performance was not bad!) and tried to find an example that actually exhibits that time complexity. But because of the great power of path compression, it was really hard...

Are there any examples that achieve Omega((m+n) log n), or is that complexity just theoretical, not practical?
Yes, there's a matching lower bound due to Michael J. Fischer in 1972 (see Section 3). His example uses a binomial tree of depth log n, where a binomial tree of depth 0 is a single node and a binomial tree of depth k is two binomial trees of depth k - 1, with the root of one pointing to the root of the other. By repeatedly unioning the root of the binomial tree into a fresh singleton and then doing a find on the deepest node, we perform an expensive (logarithmically many steps) operation that leaves another embedded binomial tree to exploit.
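A minimal sketch of that recursive construction, in the same parent-pointer form the demonstration below uses (the node labels here are my own choice for convenience; a node absent from the dict is a root):

```python
def binomial_tree(k):
    # Returns (parent, root) for a binomial tree of depth k on the
    # labels 0 .. 2**k - 1; nodes absent from `parent` are roots.
    if k == 0:
        return {}, 0
    # Two copies of the depth-(k-1) tree: the second copy's labels are
    # shifted by 2**(k-1), and its root points at the first copy's root.
    parent, root = binomial_tree(k - 1)
    shift = 2 ** (k - 1)
    shifted = {a + shift: pa + shift for a, pa in parent.items()}
    shifted[root + shift] = root
    return {**parent, **shifted}, root

parent, root = binomial_tree(3)
# 8 nodes; every node except the root 0 has a parent.
assert root == 0 and len(parent) == 7
```

A nice property of this labeling is that the parent of node a comes out as a & (a - 1) (a with its lowest set bit cleared), so a node's depth is its popcount and the deepest node is 2**k - 1 at depth k.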
Python demonstration: this prints (k+1) * 2**k, where k is the argument to example, representing an approximate operation count for O(2**k) operations on O(2**k) keys.
p = {}  # parent pointers; a node absent from p is a root

def find(a):
    global count
    # First pass: walk to the root, counting parent-pointer traversals.
    b = a
    while b in p:
        count += 1
        b = p[b]
    # Second pass: path compression - point every node on the path at the root.
    while a in p:
        pa = p[a]
        p[a] = b
        a = pa
    return b

def union(a, b):
    p[find(a)] = find(b)

def deepest():
    # Return the node with the longest path to its root.
    maxd = (0, None)
    for a, pa in p.items():
        d = 1
        while pa in p:
            d += 1
            pa = p[pa]
        maxd = max(maxd, (d, a))
    return maxd[1]

def example(k):
    global count, p
    count = 0
    p.clear()
    # Build a binomial tree of depth k on the keys 0 .. 2**k - 1.
    for a in range(2 ** k - 1, 0, -1):
        union(a, a & (a - 1))
    # Repeatedly union the root into a fresh singleton,
    # then do an expensive find on the deepest node.
    r = 0
    for a in range(2 ** k, 2 ** (k + 1)):
        union(r, a)
        find(deepest())
        r = a

example(9)
print(count)
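As an aside on why the first loop of example builds the binomial tree: the unions run from large labels downward, so both finds return their arguments unchanged and node a simply gets parent a & (a - 1) (a with its lowest set bit cleared), with node 0 as the root. Each node's depth is then its popcount. A small check of that observation (binomial_parents and depth are my own helper names):

```python
def binomial_parents(k):
    # The tree the first loop of example() produces: the parent of node a
    # is a with its lowest set bit cleared; node 0 is the root.
    return {a: a & (a - 1) for a in range(1, 2 ** k)}

def depth(parent, a):
    # Number of parent-pointer steps from a to its root.
    d = 0
    while a in parent:
        d += 1
        a = parent[a]
    return d

parent = binomial_parents(4)
# Depth equals the number of set bits, so 0b1111 = 15 is deepest, at depth 4.
assert all(depth(parent, a) == bin(a).count("1") for a in range(16))
```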
or is that complexity just theoretical, not practical?
Yes. The complexity of an algorithm is a purely theoretical construct to describe how well the algorithm scales for different input sizes (in this case, the numbers of finds and unions).
This does not give any guarantees for the number of steps required for a particular instance of input (say, 5 finds and 3 unions) - other than being finite, of course. In fact, big O notation allows an arbitrarily large multiplicative constant, which does not help to calculate exact runtimes but is enough to separate algorithms into complexity classes.