O(n) algorithm to find the median of n² implicit numbers

Problem: the input is a (not necessarily sorted) sequence S = k1, k2, ..., kn of n arbitrary numbers. Consider the collection C of n² numbers of the form min{ki, kj}, for 1 <= i, j <= n. Present an O(n) time and O(n) space algorithm to find the median of C.

So far, by examining C for different sets S, I've found that the smallest number in S appears (2n-1) times in C, the next smallest (2n-3) times, and so on, down to a single instance of the largest number.

Is there a way to use this information to find the median of C?
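The counting observation above can be checked by brute force on a small input. This is only a sanity-check sketch: the helper name `pairwise_min_counts` and the sample values are made up for illustration, and the values are assumed to be distinct (with duplicates, the counts of equal values merge).

```python
from collections import Counter

def pairwise_min_counts(values):
    """Count how often each value occurs among min(a, b) over all n*n ordered pairs."""
    return Counter(min(a, b) for a in values for b in values)

sample = [7, 2, 9, 4]  # n = 4, distinct values (assumption)
counts = pairwise_min_counts(sample)

# Print each value of S, smallest first, with its number of occurrences in C.
for value in sorted(sample):
    print(value, counts[value])
# prints:
# 2 7
# 4 5
# 7 3
# 9 1
```

The counts 7, 5, 3, 1 match 2n-1, 2n-3, ..., 1 for n = 4. Building C explicitly takes O(n²) time, so this is useful only for checking the pattern, not as a solution.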

ejf071189 asked Nov 17 '10 03:11



3 Answers

There are a number of possibilities. One I like is Hoare's Select algorithm. The basic idea is similar to a Quicksort, except that when you recurse, you only recurse into the partition that will hold the number(s) you're looking for.

For example, if you want the median of 100 numbers, you'd start by partitioning the array, just like in Quicksort. You'd get two partitions -- one of which contains the 50th element. Recursively carry out your selection in that partition. Continue until your partition contains only one element, which will be the median (and note that you can do the same for another element of your choice).
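A minimal sketch of the selection idea described above, assuming a random pivot and 0-based ranks. The function name `quickselect` is chosen here for illustration. Unlike the in-place Quicksort-style partitioning the answer describes, this version builds new lists for each partition to keep the code short. It runs in expected linear time, but an unlucky run of pivots can make it quadratic.

```python
import random

def quickselect(values, k):
    """Return the k-th smallest element of values (k is 0-based)."""
    items = list(values)
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    while True:
        pivot = random.choice(items)
        lower = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        upper = [x for x in items if x > pivot]
        if k < len(lower):
            # The target lies among the elements smaller than the pivot.
            items = lower
        elif k < len(lower) + len(equal):
            # The target is the pivot itself (or a duplicate of it).
            return pivot
        else:
            # Skip past the smaller and equal elements and continue in the rest.
            k -= len(lower) + len(equal)
            items = upper

print(quickselect([13, 23, 11, 16, 15, 10, 26], 3))  # → 15
```

Only the partition that can contain the requested rank is kept on each pass, which is what separates selection from a full sort.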

Jerry Coffin answered Oct 18 '22 21:10


Yes, good puzzle. We can find the median by building on the lines you suggested.

In C we have 1 occurrence of max(k), 3 occurrences of the next highest, 5 of the one after that, and so on.

  1. If we sort the elements of C in descending order, the occurrences of the mth highest number of S end at position m^2 (the sum of the first m odd numbers is m^2).

  2. The positions we are interested in (to calculate the median) are: a. If n is odd, the single middle position alpha = (n^2+1)/2. b. If n is even, the two middle positions alpha1 = n^2/2 and alpha2 = n^2/2+1. But n^2/2 is never a square number (sqrt(2) is irrational), so alpha1 is never the last position of a block of equal values => the number at alpha2 is equal to the number at alpha1, and the median is that common value.

  3. So it boils down to determining the smallest m such that m^2 (the sum of the first m odd numbers) is higher than n^2/2.

  4. That is, m = ceiling(n/sqrt(2)), and the median is the mth highest number in the original sequence. (Whether to find the mth highest or the (n-m+1)th lowest is an optimization.)

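  5. We can find the mth highest number by keeping track of the m largest numbers seen so far while scanning from the left (simple, but not linear time, since m is proportional to n), or use the median of medians algorithm to do it in worst-case linear time.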
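The steps above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the names `select_kth_smallest` and `median_of_pairwise_mins` are invented here, and a randomized quickselect (expected linear time) stands in for median of medians, which would be needed for a worst-case linear bound. The rank m is computed with integer arithmetic as isqrt(n^2 // 2) + 1, which equals ceiling(n/sqrt(2)) for n >= 1 and avoids floating-point rounding.

```python
import math
import random

def select_kth_smallest(values, k):
    """Return the k-th smallest element (0-based) in expected linear time."""
    items = list(values)
    while True:
        pivot = random.choice(items)
        lower = [x for x in items if x < pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if k < len(lower):
            items = lower
        elif k < len(lower) + equal_count:
            return pivot
        else:
            k -= len(lower) + equal_count
            items = [x for x in items if x > pivot]

def median_of_pairwise_mins(values):
    """Median of all n*n values min(a, b) without building the collection C."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one number")
    # Smallest m with m^2 > n^2 / 2, i.e. m = ceiling(n / sqrt(2)).
    m = math.isqrt(n * n // 2) + 1
    # The m-th highest number is the (n - m + 1)-th lowest: 0-based index n - m.
    return select_kth_smallest(values, n - m)

print(median_of_pairwise_mins([4, 1, 3, 2]))  # → 2
```

For S = 4, 1, 3, 2 we get m = 3, and sorted in descending order C is one 4, three 3s, five 2s and seven 1s, so the two middle positions (8 and 9) both hold 2.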

Om Deshmane answered Oct 18 '22 21:10


Wikipedia has a good article on selection algorithms. If you are using C++, the standard library includes an nth_element() algorithm that runs in linear time on average.

Blastfurnace answered Oct 18 '22 22:10