Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Data structure to find median

This is an interview question. Design a class, which stores integers and provides two operations:

void insert(int k)
int getMedian()

I guess I can use BST so that insert takes O(logN) and getMedian takes O(logN) (for getMedian I should add the number of of left/right children for each node).

Now I wonder if this is the most efficient solution and there is no better one.

like image 887
Michael Avatar asked Jul 06 '12 11:07

Michael


4 Answers

You can use 2 heaps, that we will call Left and Right.
Left is a Max-Heap.
Right is a Min-Heap.
Insertion is done like this:

  • If the new element x is smaller than the root of Left then we insert x to Left.
  • Else we insert x to Right.
  • If after insertion Left has count of elements that is greater than 1 from the count of elements of Right, then we call Extract-Max on Left and insert it to Right.
  • Else if after insertion Right has count of elements that is greater than the count of elements of Left, then we call Extract-Min on Right and insert it to Left.

The median is always the root of Left.

So insertion is done in O(lg n) time and getting the median is done in O(1) time.

like image 143
Avi Cohen Avatar answered Sep 28 '22 17:09

Avi Cohen


See this Stack Overflow question for a solution that involves two heaps.

like image 36
user448810 Avatar answered Sep 28 '22 17:09

user448810


Would it beat an array of integers witch performs a sort at insertion time with a sort algorithm dedicated for integer (http://en.wikipedia.org/wiki/Sorting_algorithm) if you choose your candidate amongst O < O(log(n)) and using an array, then getMedian would be taking the index at half of the size would be O(1), no? It seems possible to me to do better than log(n) + log(n).

Plus by being a little more flexible you can improve your performance by changing your sort algorithm according to the properties of your input (are the input almost sorted or not ...).

I am pretty much autodidact in computer science, but that is the way I would do it: simpler is better.

like image 23
user1458574 Avatar answered Sep 28 '22 16:09

user1458574


You could consider a self-balancing tree, too. If the tree is fully balanced, then the root node is your median. Say, the tree is one-level deeper on one end. Then, you just need to know how many nodes are there in the deeper-side to pick the correct median.

like image 36
grdvnl Avatar answered Sep 28 '22 16:09

grdvnl