Given the following problem:
"Store the largest 5000 numbers from a stream of numbers"
The solution which springs to mind is a binary search tree maintaining a count of the number of nodes in the tree and a reference to the smallest node once the count reaches 5000. When the count reaches 5000, each new number to add can be compared to the smallest item in the tree. If greater, the new number can be added then the smallest removed and the new smallest calculated (which should be very simple already having the previous smallest).
My concern with this solution is that the binary tree is naturally going to get skewed (as I'm only deleting on one side).
Is there a way to solve this problem which won't create a terribly skewed tree?
In case anyone wants it, I've included pseudo-code for my solution so far below:
process(number)
{
if (count == 5000 && number > smallest.Value)
{
addNode( root, number)
smallest = deleteNodeAndGetNewSmallest ( root, smallest)
}
}
deleteNodeAndGetNewSmallest( lastSmallest)
{
if ( lastSmallest has parent)
{
if ( lastSmallest has right child)
{
smallest = getMin(lastSmallest.right)
lastSmallest.parent.right = lastSmallest.right
}
else
{
smallest = lastSmallest.parent
}
}
else
{
smallest = getMin(lastSmallest.right)
root = lastSmallest.right
}
count--
return smallest
}
getMin( node)
{
if (node has left)
return getMin(node.left)
else
return node
}
add(number)
{
//standard implementation of add for BST
count++
}
The simplest solution for this is maintaining a min heap of max size 5000.
This solution is O(nlogk)
complexity, where n
is the number of elements and k
is the number of elements you need (5000 in your case).
It can be done also in O(n)
using selection algorithm - store all the elements, and then find the 5001th largest element, and return everything bigger than it. But it is harder to implement and for reasonable size input - might not be better. Also, if stream contains duplicates, more processing is needed.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With