Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Prepare array in linear time to find k smallest elements in O(k)

Tags:

This is an interesting question I have found on the web. Given an array containing n numbers (with no information about them), we should pre-process the array in linear time so that we can return the k smallest elements in O(k) time, when we are given a number 1 <= k <= n

I have been discussing this problem with some friends but no one could find a solution; any help would be appreciated!

like image 781
Idan Avatar asked Jun 23 '13 13:06

Idan


People also ask

What is the kth smallest element?

Definition of kth smallest element kth smallest element is the minimum possible n such that there are at least k elements in the array <= n.


2 Answers

For the pre-processing step, we will use the partition-based selection several times on the same data set.

Find the n/2-th number with the algorithm.. now the dataset is partitioned into two half, lower and upper. On the lower half find again the middlepoint. On its lower partition do the same thing and so on... Overall this is O(n) + O(n/2) + O(n/4) + ... = O(n).

Now when you have to return the k smallest elements, search for the nearest x < k, where x is a partition boundary. Everything below it can be returned, and from the next partition you have to return k - x numbers. Since the next partition's size is O(k), running another selection algorithm for the k - x th number will return the rest.

like image 94
Karoly Horvath Avatar answered Oct 19 '22 05:10

Karoly Horvath


We can find the median of a list and partition around it in linear time.

Then we can use the following algorithm: maintain a buffer of size 2k.

Every time the buffer gets full, we find the median and partition around it, keeping only the lowest k elements.
This requires n/k find-median-and-partition steps, each of which take O(k) time with a traditional quickselect. this approach requires only O(n) time.

Additionally if you need the sorted output. Which adds an additional O(k log k) time. In total, this approach requires only O(n + k log k) time and O(k) space.

like image 42
Jayram Avatar answered Oct 19 '22 07:10

Jayram