Does using union-find in Kruskal's algorithm actually affect worst-case runtime?

So I'm teaching myself some graph algorithms, now on Kruskal's, and understand that it's recommended to use union-find so checking whether adding an edge creates a cycle only takes O(log V) time. For practical purposes, I see why you'd want to, but strictly in terms of Big O notation, does doing so actually affect the worst-case complexity?

My reasoning: if instead of union-find we did a DFS to check for cycles, the runtime for that would be O(E + V), and you have to perform that V times for a runtime of O(V^2 + VE). It's more than with union-find, which would be O(V log V), but the bulk of the complexity of Kruskal's comes from deleting the minimum element of the priority queue E times, which is O(E log E), so that's the Big O answer either way. I don't really see a space advantage either, since the union-find takes O(V) space and so do the data structures you need to maintain to find a cycle using DFS.
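For reference, a rough sketch of the DFS check I have in mind (Python, my own naming; the forest is an adjacency list that I'd have to update every time an edge is accepted):

    from collections import defaultdict

    def creates_cycle(adj, u, v):
        # Return True if u and v are already connected in the forest,
        # i.e. adding edge (u, v) would create a cycle.
        stack, seen = [u], {u}
        while stack:
            node = stack.pop()
            if node == v:
                return True
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    forest = defaultdict(list)
    if not creates_cycle(forest, 0, 1):
        forest[0].append(1)
        forest[1].append(0)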

So a probably overly long explanation for a simple question: Does using union-find in Kruskal's algorithm actually affect worst-case runtime?

asked Aug 16 '15 by Chirayu Poudel



1 Answer

and understand that it's recommended to use union-find so checking whether adding an edge creates a cycle only takes O(log V) time

This isn't right. Performing m union-find operations takes O(m * alpha(n)) amortized time, where alpha(n) is the inverse of the Ackermann function and, for all intents and purposes, can be considered constant. So it's much faster than logarithmic:

Since alpha(n) is the inverse of this function, alpha(n) is less than 5 for all remotely practical values of n. Thus, the amortized running time per operation is effectively a small constant.
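For concreteness, here is a minimal union-find sketch (Python, illustrative names) using union by size plus path halving, a simple form of path compression; this pairing is one of the heuristic combinations known to achieve the inverse-Ackermann bound:

    class DisjointSet:
        def __init__(self, n):
            self.parent = list(range(n))
            self.size = [1] * n

        def find(self, x):
            # Path halving: point every other visited node at its grandparent.
            while self.parent[x] != x:
                self.parent[x] = self.parent[self.parent[x]]
                x = self.parent[x]
            return x

        def union(self, a, b):
            ra, rb = self.find(a), self.find(b)
            if ra == rb:
                return False  # same component already: the edge would close a cycle
            if self.size[ra] < self.size[rb]:
                ra, rb = rb, ra  # attach the smaller tree under the larger one
            self.parent[rb] = ra
            self.size[ra] += self.size[rb]
            return True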


but the bulk of the complexity of Kruskal's comes from deleting the minimum element of the priority queue E times

This is also wrong. Kruskal's algorithm does not involve any priority queues. It sorts the edges by cost once at the beginning. The complexity of that step is still the O(E log E) you mention, but sorting is likely faster in practice than a priority queue: popping every element out of a priority queue is, at best, equivalent to a heap sort, which is not the fastest sorting algorithm.
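A quick way to see that for yourself is a micro-benchmark sketch like the following (Python; the 10^6 edge count is arbitrary, and actual timings will vary by machine and runtime):

    import heapq, random, time

    edges = [(random.random(), i, i + 1) for i in range(1_000_000)]

    t = time.perf_counter()
    sorted(edges)                # one up-front sort of all edges
    print("sort:", time.perf_counter() - t)

    t = time.perf_counter()
    heap = edges[:]
    heapq.heapify(heap)
    while heap:                  # E deletions of the minimum == a heap sort
        heapq.heappop(heap)
    print("heap:", time.perf_counter() - t)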

Bottom line, if m is the number of edges and n the number of nodes (see the sketch after this list):

  1. Sorting the edges: O(m log m).

  2. For each edge, calling union-find: O(m * alpha(n)), or basically just O(m).

  3. Total complexity: O(m log m + m * alpha(n)).

  4. If you don't use union-find, the total complexity becomes O(m log m + m * (n + m)) with your O(n + m) cycle-finding algorithm. And O(n + m) is probably an understatement for this step, since you must also update your structure somehow (insert an edge) whenever you accept one. Even the naive disjoint-set algorithm comes in at O(n log n) overall, so the comparison is even worse for the DFS approach.

Note: in this case, you can write log n instead of log m if you prefer, because m = O(n^2) and log(n^2) = 2 log n.
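Putting the steps together, a sketch of Kruskal's algorithm (Python, reusing the DisjointSet sketch from above; the (cost, u, v) edge format is just an illustrative convention):

    def kruskal(n, edges):
        # n vertices labelled 0..n-1, edges as (cost, u, v) tuples.
        ds = DisjointSet(n)
        mst = []
        for cost, u, v in sorted(edges):   # step 1: sort, O(m log m)
            if ds.union(u, v):             # step 2: union-find, O(alpha(n)) amortized
                mst.append((u, v, cost))
                if len(mst) == n - 1:      # a spanning tree has exactly n - 1 edges
                    break
        return mst

    edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
    print(kruskal(4, edges))  # [(0, 1, 1), (1, 3, 2), (1, 2, 3)], total cost 6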

In conclusion: yes, union-find helps a lot.

Even if you use the O(log n) variant of union-find, the total complexity would be O(m log m + m log n), which you could simplify to O(m log m); still, in practice you'd rather keep the second part faster if you can. Since union-find is very easy to implement, there's really no reason not to.

answered Nov 15 '22 by IVlad