 

How do I find the median of numbers in linear time using heaps?

Wikipedia says:

Selection algorithms: Finding the min, max, both the min and max, median, or even the k-th largest element can be done in linear time using heaps.

All it says is that it can be done, and not how.

Can you give me a starting point on how this can be done using heaps?

Lazer asked Apr 05 '10 17:04



People also ask

How do you find the median of an array in linear time?

To find the median of an unsorted array, we can build a min-heap of the n elements in O(n) time and then extract n/2 elements one by one to reach the median. Each extraction costs O(log n), so this approach takes O(n log n) time overall and is not linear.
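A minimal sketch of that heap-based approach, assuming Python's standard `heapq` module. The function name `median_via_heap` is hypothetical, and the sketch returns the lower median when the input has an even number of elements.

```python
import heapq

def median_via_heap(values):
    """Return the lower median of a non-empty sequence using a binary min-heap."""
    heap = list(values)        # copy so the caller's data is left untouched
    heapq.heapify(heap)        # bottom-up heap construction, O(n)
    k = (len(heap) - 1) // 2   # 0-based rank of the lower median
    for _ in range(k):         # about n/2 pops at O(log n) each
        heapq.heappop(heap)
    return heap[0]             # the smallest remaining element has rank k
```

The pops, not the heap construction, are what push the total cost to O(n log n).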

Can median be found in linear time?

The median-of-medians algorithm is a deterministic linear-time selection algorithm. It divides the list into small sublists (typically of five elements each), finds the median of each sublist, and then recursively finds the median of those medians. That value is only an approximate median of the whole list, but it is a good enough pivot to guarantee linear time for the selection.
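The idea can be sketched as follows. This is an illustrative version under stated assumptions, not a tuned implementation: the names `select` and `median` are hypothetical, groups have at most five elements, and the median of an even-length input is taken to be the lower one.

```python
def select(values, k):
    """Return the k-th smallest element (0-based) of a non-empty sequence."""
    items = list(values)
    while True:
        if len(items) <= 5:
            return sorted(items)[k]
        # Median of each group of at most five elements.
        medians = []
        for i in range(0, len(items), 5):
            group = sorted(items[i:i + 5])
            medians.append(group[len(group) // 2])
        # The median of the group medians serves as the pivot.
        pivot = select(medians, len(medians) // 2)
        lows = [x for x in items if x < pivot]
        highs = [x for x in items if x > pivot]
        num_equal = len(items) - len(lows) - len(highs)
        if k < len(lows):
            items = lows                  # answer is among the smaller elements
        elif k < len(lows) + num_equal:
            return pivot                  # answer equals the pivot
        else:
            k -= len(lows) + num_equal    # answer is among the larger elements
            items = highs

def median(values):
    """Return the lower median of a non-empty sequence."""
    return select(values, (len(values) - 1) // 2)
```

Note that this version uses no heap at all; it partitions around the pivot the way quickselect does, with the pivot choice providing the worst-case guarantee.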


1 Answer

You would use a min-max-median heap to find the min, max and median in constant time (after taking linear time to build the heap). You can use order-statistics trees to find the kth smallest/largest value. Both of these data structures are described in this paper on min-max heaps [PDF]. Min-max heaps are binary heaps whose levels alternate between min-ordered and max-ordered: a node on a min level is the smallest in its subtree, and a node on a max level is the largest in its subtree.

From the paper:

A min-max-median heap is a binary tree with the following properties:

  1. The median of all elements is located at the root

  2. The left subtree of the root is a min-max heap Hl of size ceiling[((n-1)/2)] containing elements less than or equal to the median. The right subtree is a max-min heap Hr of size floor[((n-1)/2)] containing only elements greater than or equal to the median.

The paper goes on to explain how to build such a heap.

Upon reading the paper more thoroughly, it appears that building a min-max-median heap requires that you first find the median (from the article: "Find the median of all n elements using any one of the known linear-time algorithms"). That said, once you have built the heap, you can maintain the median simply by maintaining the balance between the min-max heap on the left and the max-min heap on the right. DeleteMedian replaces the root with either the min of the max-min heap or the max of the min-max heap (whichever maintains the balance).
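The balancing idea can be sketched with a simpler substitute for the paper's structure: two ordinary binary heaps from Python's `heapq` module instead of a min-max heap and a max-min heap. This gives up constant-time min and max but keeps the median at hand. The class name `RunningMedian` and its methods are hypothetical, and the median reported is the lower one.

```python
import heapq

class RunningMedian:
    """Track the lower median of a changing collection with two binary heaps."""

    def __init__(self):
        self._low = []    # max-heap stored as negated values: elements <= median
        self._high = []   # min-heap: elements >= median

    def add(self, x):
        if self._low and x > -self._low[0]:
            heapq.heappush(self._high, x)
        else:
            heapq.heappush(self._low, -x)
        self._rebalance()

    def _rebalance(self):
        # Keep len(low) equal to len(high) or exactly one larger.
        if len(self._low) > len(self._high) + 1:
            heapq.heappush(self._high, -heapq.heappop(self._low))
        elif len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self):
        return -self._low[0]   # raises IndexError when empty

    def delete_median(self):
        m = -heapq.heappop(self._low)
        self._rebalance()      # may pull the smallest element of the upper half down
        return m
```

For example, after adding 5, 1, 4, 2 and 3, `median()` returns 3; calling `delete_median()` removes it, and the new median is 2.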

So if you plan on using a min-max-median heap to find the median of a fixed data set, you're out of luck, because building the heap already requires the median. If you are maintaining the median of a changing data set, though, it does work.

Niki Yoshiuchi answered Oct 14 '22 04:10
