Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding the Index of N biggest elements in Python Array / List Efficiently

Tags:

I'm sorry in advance if this is a duplicated question, I looked for this information but still couldn't find it.

Is it possible to arrange a numpy array (or python list) by using the indexes of the N biggest elements in decreasing order very efficiently?

For instance, the array:

a = array([4, 1, 0, 8, 5, 2]) 

The indexes of the biggest elements in decreasing order would give (considering N = 6, all the elements are included):

8 --> 3

5 --> 4

4 --> 0

2 --> 5

1 --> 1

0 --> 2

result = [3, 4, 0, 5, 1, 2] 

I know how to make it using a somewhat silly approach (like sorting the array and searching for each of the N numbers for their indexes), but I was wondering if is there any efficient library like bottleneck or heapq or maybe a pythonic approach to make this very fast. I have to apply it in several arrays with 300k elements each so that's why performance is an issue.

Thanks in advance!

UPDATE

I read the answers and decided to timeit them using a 300k of random integers, here are the results:

solution 1: sorted(range(len(a)), key=lambda i:a[i]) time: 230 ms

solution 2: heapq.nlargest(len(a), zip(a, itertools.count())) time: 396 ms

solution 3: heapq.nlargest(len(a), enumerate(a), key=operator.itemgetter(1)) time: 864 ms

solution 4: def f(a,N): return np.argsort(a)[::-1][:N] (N = len(a)) time: 104 ms

Thanks a lot for the fast and very good answers!

like image 627
Willian Fuks Avatar asked Oct 08 '12 18:10

Willian Fuks


People also ask

How do you find the index of the largest number in a list in Python?

Use for loop to find out the index of the maximum value in a list. Use the max() and list. index() functions to find out the index of the maximum value in a list. Use the enumerate() function to find out the index of the maximum value in a list.

How do you find the N largest value in an array?

For getting n-largest values from a NumPy array we have to first sort the NumPy array using numpy. argsort() function of NumPy then applying slicing concept with negative indexing. Return: [index_array, ndarray] Array of indices that sort arr along the specified axis.

How do you find the N maximum indices of a NumPy array in Python?

In order to get the indices of N maximum values in a NumPy array, we can use the argsort() function.

How do you find the highest value in an array in Python?

Python also has a built-in max() function that can calculate maximum values of iterables. You can use this built-in max() to find the maximum element in a one-dimensional NumPy array, but it has no support for arrays with more dimensions.


1 Answers

Have you looked at the built-in numpy argsort method?:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.argsort.html

I can sort an array with 300,000 random floats in about 29 ms on my machine using that method.

def f(a,N):     return np.argsort(a)[::-1][:N] 
like image 157
JoshAdel Avatar answered Oct 16 '22 01:10

JoshAdel