Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find the largest 10% of numbers in an array in order

Given an array with 'N' numbers (N >100). How could we find the largest 10% of them in order ? (if n/10 is not an integer, we can round it)

I came up with 3 algorithms to attempt the above problem, but I am not sure which one is the best in terms of asymptotic running time. Could I actually make any modification to reduce the asymptotic time? Also, if N gets really large, which algorithm might still be efficient one?

I'm listing my ideas for the algorithms below and could really use some help to figure out the most efficient algorithm for this.

Algo-1

I used selection sort and stopped it once the 10% of numbers were sorted.

Algo-2

I built a max-heap and kept removing the largest 10% of numbers

Algo-3

Haven't implemented this one, but the idea I have is to use any order-statistic algorithm to find a partition containing the top 10% of numbers and then sort them using merge sort.

like image 454
noviceneedshelp Avatar asked Feb 28 '10 16:02

noviceneedshelp


1 Answers

The quickest solution is to use the partition-based selection algorithm, which runs in O(n). It's based of the quicksort idea, except instead of sorting both partitions recursively, you only go to one of the partitions to find the k-th smallest element.

Finding the largest 10% is accomplished by searching for the k=(90%*N)-th smallest number.

If you recall how partitioning in quicksort works, elements less than the pivot is moved to the left, and the rest of the elements go to the right. Let's say you want to select the k-th smallest element. Then you see if there are at least k elements to the left of the pivot. If there are, then you know you can ignore the elements in the right partition. Otherwise, you can ignore all elements in the left partition, because you know that element will be in the right partition.

Note that the selection algorithm only identifies what those top 10% numbers are. If you need them to be sorted, then you have to sort those numbers (but only those numbers, the other 90% can be ignored).

like image 108
polygenelubricants Avatar answered Oct 23 '22 21:10

polygenelubricants