Apart from the median-of-medians algorithm, is there any other way to do k-selection in worst-case O(n) time? And does implementing median-of-medians make sense in practice; that is, is the performance advantage good enough for practical purposes?
Time complexity analysis: The average-case time complexity of quickselect is O(n), compared with O(n log n) for quicksort. The worst-case time complexity is still O(n²), but by using a random pivot the worst case becomes very unlikely in practice.
As in quicksort, we partition the array around a pivot, but unlike quicksort we then recurse into only the one partition where the target element must lie. The expected work is therefore n + n/2 + n/4 + ..., which sums to O(n).
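To make that concrete, here is a minimal quickselect sketch in Python (the function name and 0-indexed `k` convention are my own choices, not from any particular library):

```python
import random

def quickselect(arr, k):
    """Return the k-th smallest element (0-indexed) of arr.
    Average-case O(n); worst case O(n^2), made unlikely by random pivots."""
    arr = list(arr)          # work on a copy; partitioning is in place
    lo, hi = 0, len(arr) - 1
    while True:
        if lo == hi:
            return arr[lo]
        # Pick a random pivot and move it to the end of the range.
        p = random.randint(lo, hi)
        arr[p], arr[hi] = arr[hi], arr[p]
        pivot = arr[hi]
        # Lomuto partition: elements < pivot end up in arr[lo:i].
        i = lo
        for j in range(lo, hi):
            if arr[j] < pivot:
                arr[i], arr[j] = arr[j], arr[i]
                i += 1
        arr[i], arr[hi] = arr[hi], arr[i]
        # Recurse (iteratively) only into the side containing index k.
        if k == i:
            return arr[i]
        elif k < i:
            hi = i - 1
        else:
            lo = i + 1
```

For example, `quickselect([3, 1, 4, 1, 5, 9, 2, 6], 3)` returns the 4th-smallest element. Each round discards one side of the partition entirely, which is exactly where the n + n/2 + n/4 + ... geometric sum comes from.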
The worst-case time complexity of a typical implementation of quicksort is O(n²). The worst case occurs when the picked pivot is always an extreme (smallest or largest) element. This happens, for example, when the input array is sorted or reverse sorted and either the first or the last element is always picked as the pivot.
In computer science, quickselect is a selection algorithm to find the kth smallest element in an unordered list. It is related to the quicksort sorting algorithm. Like quicksort, it was developed by Tony Hoare, and thus is also known as Hoare's selection algorithm.
There is another algorithm for computing kth order statistics based on the soft heap data structure, which is a variant of a standard priority queue that is allowed to "corrupt" some number of the priorities it stores. The algorithm is described in more detail in the Wikipedia article, but the basic idea is to use the soft heap to efficiently (in O(n) time) pick a pivot for the partition function that is guaranteed to give a good split. In a sense, this is simply a modified version of the median-of-medians algorithm that uses an (arguably) more straightforward approach to choosing the pivot element.
Soft heaps are not particularly intuitive, but there is a pretty good description of them available in this paper ("A simpler implementation and analysis of Chazelle's soft heaps"), which includes a formal description and analysis of the data structure.
However, if you want a really fast, worst-case O(n) algorithm, consider looking into introselect. This algorithm is actually quite brilliant. It starts off by using the quickselect algorithm, which picks a pivot cheaply and uses it to partition the data. This is extremely fast in practice but has bad worst-case behavior. Introselect fixes this by keeping an internal counter that tracks its progress. If the algorithm ever looks like it is about to degrade to O(n²) time, it switches strategies and uses something like median-of-medians to restore the worst-case guarantee. Specifically, it watches how much of the array is discarded at each step; if some constant number of steps occur before half the input has been discarded, the algorithm switches to median-of-medians to ensure the next pivot is good, then resumes quickselect. This guarantees worst-case O(n) time.
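A rough sketch of that idea, assuming a "bad round" means a partition that discards less than a quarter of the input, and a fallback after two consecutive bad rounds (the threshold and function names here are illustrative, not the exact constants from any particular implementation):

```python
import random

def median_of_medians_pivot(arr):
    """Pick a guaranteed-good pivot: the median of the medians of groups of 5."""
    if len(arr) <= 5:
        return sorted(arr)[len(arr) // 2]
    medians = [sorted(arr[i:i + 5])[len(arr[i:i + 5]) // 2]
               for i in range(0, len(arr), 5)]
    return select(medians, len(medians) // 2, force_mom=True)

def select(arr, k, force_mom=False):
    """Introselect-style selection: random-pivot quickselect that falls back
    to median-of-medians pivots when progress is too slow."""
    arr = list(arr)
    bad_rounds = 0  # consecutive rounds that discarded < 1/4 of the input
    while len(arr) > 1:
        if force_mom or bad_rounds >= 2:
            pivot = median_of_medians_pivot(arr)  # guaranteed good split
        else:
            pivot = random.choice(arr)            # cheap, usually fine
        lows   = [x for x in arr if x < pivot]
        pivots = [x for x in arr if x == pivot]
        highs  = [x for x in arr if x > pivot]
        if k < len(lows):
            nxt = lows
        elif k < len(lows) + len(pivots):
            return pivot
        else:
            nxt = highs
            k -= len(lows) + len(pivots)
        # Track progress: did this round discard at least 1/4 of the input?
        if len(nxt) > 3 * len(arr) // 4:
            bad_rounds += 1
        else:
            bad_rounds = 0
        arr = nxt
    return arr[0]
```

Because median-of-medians guarantees that roughly 30% of the elements fall on each side of its pivot, the fallback path alone is worst-case O(n), and the counter ensures the cheap random-pivot path can only waste a constant number of rounds before the fallback kicks in.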
The advantage of this algorithm is that it's extremely fast on most inputs (since quickselect is very fast), but has great worst-case behavior. A description of this algorithm, along with the related sorting algorithm introsort, can be found in this paper ("Introspective Sorting and Selection Algorithms").
Hope this helps!