
Worst-case O(n) algorithm for doing k-selection

Apart from the median-of-medians algorithm, is there any other way to do k-selection in worst-case O(n) time? Does implementing median-of-medians make sense in practice; that is, is the performance advantage good enough for practical purposes?

asked Sep 09 '11 by Harman


People also ask

What is the best and worst case time complexity of quickselect?

The average-case time complexity of quickselect is O(n), down from the O(n log n) of quicksort. The worst-case time complexity is still O(n²), but by using a random pivot the worst case can be avoided in almost all practical cases.

Why is quickselect O(n) on average?

As in quicksort, we partition the array around a pivot, but instead of recursing into both halves we only continue in the single partition that is expected to contain the element we are looking for. The expected work is therefore roughly n + n/2 + n/4 + ..., which sums to O(n).

What is the average and worst case time complexity of quicksort?

The average-case time complexity of quicksort is O(n log n). The worst-case time complexity of a typical implementation is O(n²). The worst case occurs when the picked pivot is always an extreme (smallest or largest) element, which happens when the input array is already sorted or reverse sorted and either the first or the last element is picked as the pivot.

What kind of algorithm is quickselect?

In computer science, quickselect is a selection algorithm to find the kth smallest element in an unordered list. It is related to the quicksort sorting algorithm. Like quicksort, it was developed by Tony Hoare, and thus is also known as Hoare's selection algorithm.
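
To make this concrete, here is a minimal randomized quickselect sketch in Python (a generic textbook-style implementation, not taken from any particular library): average-case O(n), worst-case O(n²), with a random pivot to make the worst case unlikely.

    import random

    def quickselect(items, k):
        """Return the k-th smallest element (0-indexed) of items.

        Average case O(n); worst case O(n^2), made unlikely by the random pivot.
        """
        arr = list(items)          # work on a copy
        lo, hi = 0, len(arr) - 1
        while True:
            if lo == hi:
                return arr[lo]
            # Pick a random pivot and move it to the end (Lomuto partition).
            pivot_idx = random.randint(lo, hi)
            arr[pivot_idx], arr[hi] = arr[hi], arr[pivot_idx]
            pivot = arr[hi]
            store = lo
            for i in range(lo, hi):
                if arr[i] < pivot:
                    arr[i], arr[store] = arr[store], arr[i]
                    store += 1
            arr[store], arr[hi] = arr[hi], arr[store]
            # Continue (iteratively) only in the side that contains index k.
            if k == store:
                return arr[store]
            elif k < store:
                hi = store - 1
            else:
                lo = store + 1

For example, quickselect([7, 1, 5, 3, 9], 0) returns the smallest element, 1.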


1 Answer

There is another algorithm for computing kth order statistics based on the soft heap data structure, a variant of a standard priority queue that is allowed to "corrupt" some number of the priorities it stores. The algorithm is described in more detail in the Wikipedia article, but the basic idea is to use the soft heap to efficiently (in O(n) time) pick a pivot for the partition function that is guaranteed to produce a good split. In a sense, it is simply a modified version of the median-of-medians algorithm that uses an (arguably) more straightforward approach to choosing the pivot element.

Soft heaps are not particularly intuitive, but there is a pretty good description of them available in this paper ("A simpler implementation and analysis of Chazelle's soft heaps"), which includes a formal description and analysis of the data structure.

However, if you want a really fast, worst-case O(n) algorithm, consider looking into introselect. This algorithm is actually quite brilliant. It starts off with the quickselect algorithm, which picks a pivot unintelligently and uses it to partition the data; this is extremely fast in practice but has bad worst-case behavior. Introselect fixes this by maintaining an internal counter that tracks its progress. If the algorithm ever looks like it is about to degrade to O(n²) time, it switches strategies and uses something like median-of-medians to preserve the worst-case guarantee. Specifically, it watches how much of the array is discarded at each step; if some constant number of steps occur before half the input has been discarded, it switches to the median-of-medians algorithm to ensure that the next pivot is good, and then resumes quickselect. This guarantees worst-case O(n) time.
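
For illustration, here is a hedged Python sketch of the introselect idea (the names and constants such as shrink_factor and patience are my own, not from the paper): run randomized quickselect, monitor how quickly the candidate set shrinks, and fall back to deterministic median-of-medians selection if progress stalls. For simplicity this sketch switches over to median-of-medians permanently rather than only borrowing a single good pivot; either way the worst case stays O(n).

    import random

    def median_of_medians_select(arr, k):
        """Deterministic worst-case O(n) selection via the median-of-medians pivot."""
        arr = list(arr)
        while True:
            if len(arr) <= 5:
                return sorted(arr)[k]
            # Median of each group of 5, then recurse to pick a provably good pivot.
            medians = [sorted(arr[i:i + 5])[len(arr[i:i + 5]) // 2]
                       for i in range(0, len(arr), 5)]
            pivot = median_of_medians_select(medians, len(medians) // 2)
            lows = [x for x in arr if x < pivot]
            highs = [x for x in arr if x > pivot]
            num_pivots = len(arr) - len(lows) - len(highs)
            if k < len(lows):
                arr = lows
            elif k < len(lows) + num_pivots:
                return pivot
            else:
                k -= len(lows) + num_pivots
                arr = highs

    def introselect(arr, k, shrink_factor=0.75, patience=3):
        """Randomized quickselect with a progress monitor; falls back to the
        deterministic algorithm if the candidate set shrinks too slowly."""
        arr = list(arr)
        budget = patience
        threshold = int(len(arr) * shrink_factor)
        while len(arr) > 5:
            if budget == 0:
                # Progress has stalled: switch to the worst-case O(n) algorithm.
                return median_of_medians_select(arr, k)
            pivot = arr[random.randrange(len(arr))]
            lows = [x for x in arr if x < pivot]
            highs = [x for x in arr if x > pivot]
            num_pivots = len(arr) - len(lows) - len(highs)
            if k < len(lows):
                arr = lows
            elif k < len(lows) + num_pivots:
                return pivot
            else:
                k -= len(lows) + num_pivots
                arr = highs
            if len(arr) <= threshold:
                # Enough of the input was discarded: reset the progress monitor.
                budget = patience
                threshold = int(len(arr) * shrink_factor)
            else:
                budget -= 1
        return sorted(arr)[k]

For example, introselect([7, 1, 5, 3, 9], 2) returns the median, 5.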

The advantage of this algorithm is that it's extremely fast on most inputs (since quickselect is very fast), but has great worst-case behavior. A description of this algorithm, along with the related sorting algorithm introsort, can be found in this paper ("Introspective Sorting and Selection Algorithms").

Hope this helps!

answered Sep 20 '22 by templatetypedef