Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Explanation of the Median of Medians algorithm

The Median of medians approach is very popular in quicksort type partitioning algorithms to yield a fairly good pivot, such that it partitions the array uniformly. Its logic is given in Wikipedia as:

The chosen pivot is both less than and greater than half of the elements in the list of medians, which is around n/10 elements (1/2 * (n/5)) for each half. Each of these elements is a median of 5, making it less than 2 other elements and greater than 2 other elements outside the block. Hence, the pivot is less than 3(n/10) elements outside the block, and greater than another 3(n/10) elements outside the block. Thus the chosen median splits the elements somewhere between 30%/70% and 70%/30%, which assures worst-case linear behavior of the algorithm.

Can somebody explain it a bit lucidly for me. I am finding it difficult to understand the logic.

like image 767
SexyBeast Avatar asked Sep 22 '12 16:09

SexyBeast


People also ask

What is median of medians used for?

In computer science, the median of medians is an approximate (median) selection algorithm, frequently used to supply a good pivot for an exact selection algorithm, mainly the quickselect, that selects the kth smallest element of an initially unsorted array.

Can you take the median of medians?

No, unfortunately there is not a way to calculate the median based on medians of subsets of the whole and still be statistically accurate. If you wanted to calculate the mean, however, you could use the means of subsets, given that they are of equal size.

What is the purpose of online median finding problem?

Given a stream of integers, find the median of the set of integers read from the stream so far. If the median is a non-integer, round it down to the nearest integer (Math. floor of the value).


1 Answers

Think of the following set of numbers:

5 2 6 3 1

The median of these numbers is 3. Now if you have a number n, if n > 3, then it is bigger than at least half of the numbers above. If n < 3, then it is smaller than at least half of the numbers above.

So that is the idea. That is, for each set of 5 numbers, you get their median. Now you have n / 5 numbers. This is obvious.

Now if you get the median of those numbers (call it m), it is bigger than half of them and smaller than the other half (by definition of median!). In other words, m is bigger than n / 10 numbers (which themselves were medians of small 5 element groups) and bigger than another n / 10 numbers (which again were medians of small 5 element groups).

In the example above, we saw that if the median is k and you have m > k, then m is also bigger than 2 other numbers (that were themselves smaller than k). This means that for each of those smaller 5 element groups where m was bigger than its median, m is also bigger than two other numbers. This makes it at least 3 numbers (2 numbers + the median itself) in each of those n / 10 small 5 element groups, that are smaller than m. Hence, m is at least bigger than 3n/10 numbers.

Similar logic for the number of elements m is bigger than.

like image 102
Shahbaz Avatar answered Oct 13 '22 13:10

Shahbaz