I'm sorry in advance if this is a duplicate question; I looked for this information but still couldn't find it.
Is there a very efficient way to get the indexes of the N biggest elements of a numpy array (or Python list), ordered by decreasing value?
For instance, the array:
a = array([4, 1, 0, 8, 5, 2])
The indexes of the biggest elements in decreasing order would give (considering N = 6, all the elements are included):
8 --> 3
5 --> 4
4 --> 0
2 --> 5
1 --> 1
0 --> 2
result = [3, 4, 0, 5, 1, 2]
I know how to do it with a somewhat silly approach (sorting the array and then searching for the index of each of the N numbers), but I was wondering whether there is an efficient library such as bottleneck or heapq, or maybe a pythonic approach, that makes this very fast. I have to apply it to several arrays with 300k elements each, which is why performance is an issue.
Thanks in advance!
UPDATE
I read the answers and decided to time them with timeit on an array of 300k random integers. Here are the results:
solution 1: sorted(range(len(a)), key=lambda i:a[i])
time: 230 ms
solution 2: heapq.nlargest(len(a), zip(a, itertools.count()))
time: 396 ms
solution 3: heapq.nlargest(len(a), enumerate(a), key=operator.itemgetter(1))
time: 864 ms
solution 4: def f(a,N): return np.argsort(a)[::-1][:N] (N = len(a))
time: 104 ms
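For anyone who wants to repeat the comparison, here is a rough sketch of a timing harness. The function names, the seed, and the way the random integers are generated are my own assumptions, not the exact setup behind the numbers above, so absolute timings will differ by machine.

```python
import heapq
import itertools
import operator
import timeit

import numpy as np


def solution_1(a):
    # Indexes ordered by the values they point to (ascending, as posted).
    return sorted(range(len(a)), key=lambda i: a[i])


def solution_2(a):
    # Returns (value, index) pairs, largest value first.
    return heapq.nlargest(len(a), zip(a, itertools.count()))


def solution_3(a):
    # Returns (index, value) pairs, largest value first.
    return heapq.nlargest(len(a), enumerate(a), key=operator.itemgetter(1))


def solution_4(a):
    # Indexes of all elements, largest value first.
    return np.argsort(a)[::-1][:len(a)]


def time_all(size=300_000, repeats=3, seed=0):
    """Return the best wall-clock time in milliseconds for each solution."""
    rng = np.random.default_rng(seed)  # assumed data: uniform random integers
    data = rng.integers(0, size, size=size)
    results = {}
    for func in (solution_1, solution_2, solution_3, solution_4):
        best = min(timeit.repeat(lambda: func(data), number=1, repeat=repeats))
        results[func.__name__] = best * 1000.0
    return results


if __name__ == "__main__":
    for name, ms in time_all().items():
        print(f"{name}: {ms:.1f} ms")
```

Note that the solutions do not all return the same thing: solution 1 yields indexes in ascending order of value, and solutions 2 and 3 yield pairs rather than bare indexes.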
Thanks a lot for the fast and very good answers!
To find the index of the maximum value in a list, you can use a for loop, combine the max() and list.index() functions, or use the enumerate() function.
To get the N largest values from a NumPy array, first sort the array with the numpy.argsort() function, then slice the result using negative indexing. numpy.argsort() returns index_array (an ndarray): the array of indices that sort arr along the specified axis.
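The three approaches for finding the index of the maximum in a plain list could look roughly like this. The function names are made up for illustration, and the list is assumed to be non-empty.

```python
def argmax_loop(values):
    # Walk the list once, remembering the position of the best value so far.
    best_index = 0
    for i in range(1, len(values)):
        if values[i] > values[best_index]:
            best_index = i
    return best_index


def argmax_index(values):
    # Two passes: max() finds the value, list.index() finds its first position.
    return values.index(max(values))


def argmax_enumerate(values):
    # Pair each value with its index and pick the pair with the largest value.
    return max(enumerate(values), key=lambda pair: pair[1])[0]


values = [4, 1, 0, 8, 5, 2]
print(argmax_loop(values), argmax_index(values), argmax_enumerate(values))
```

When the maximum occurs more than once, all three return the position of its first occurrence.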
In order to get the indices of N maximum values in a NumPy array, we can use the argsort() function.
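A small sketch of that argsort approach, using the array from the question (the helper name top_n_indices is just for illustration):

```python
import numpy as np


def top_n_indices(a, n):
    # argsort gives the indices that would sort `a` in ascending order;
    # reversing puts the largest first, and the slice keeps the first n.
    order = np.argsort(a)
    return order[::-1][:n]


a = np.array([4, 1, 0, 8, 5, 2])
print(top_n_indices(a, 3))   # indices of the three largest values
print(a[top_n_indices(a, 3)])  # the values themselves, largest first
```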
Python also has a built-in max() function that can calculate maximum values of iterables. You can use this built-in max() to find the maximum element in a one-dimensional NumPy array, but it has no support for arrays with more dimensions.
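To illustrate the difference, the sketch below tries the built-in max() on a 1-D and a 2-D array and then shows the NumPy equivalents. The helper builtin_max_works is hypothetical and only exists to make the failure visible.

```python
import numpy as np


def builtin_max_works(arr):
    # The built-in max() compares items pairwise; for a 2-D array with
    # several rows the items are whole rows, and comparing rows yields an
    # array whose truth value is ambiguous, which raises ValueError.
    try:
        max(arr)
        return True
    except ValueError:
        return False


one_d = np.array([4, 1, 0, 8, 5, 2])
two_d = np.array([[4, 1, 0], [8, 5, 2]])

print(builtin_max_works(one_d))  # built-in max() handles the 1-D case
print(builtin_max_works(two_d))  # but not this 2-D case

# NumPy's own reductions work for any number of dimensions.
overall_max = np.max(two_d)
row_max = two_d.max(axis=1)
position = np.unravel_index(np.argmax(two_d), two_d.shape)
print(overall_max, row_max, position)
```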
Have you looked at the built-in numpy argsort method?
http://docs.scipy.org/doc/numpy/reference/generated/numpy.argsort.html
I can sort an array with 300,000 random floats in about 29 ms on my machine using that method.
def f(a,N): return np.argsort(a)[::-1][:N]
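If N is much smaller than len(a), a possible variation (not part of the answer above, and not timed here) is to use np.argpartition to pick out the N largest elements first and then sort only those. A minimal sketch, with a hypothetical helper name:

```python
import numpy as np


def top_n_indices_partition(a, n):
    a = np.asarray(a)
    if n <= 0:
        return np.empty(0, dtype=np.intp)
    if n >= a.size:
        # Nothing to gain from partitioning; fall back to a full argsort.
        return np.argsort(a)[::-1]
    # Indices of the n largest elements, in no particular order.
    candidates = np.argpartition(a, -n)[-n:]
    # Sort just those n candidates by value, largest first.
    return candidates[np.argsort(a[candidates])[::-1]]


a = np.array([4, 1, 0, 8, 5, 2])
print(top_n_indices_partition(a, 3))
```

Whether this actually beats the plain argsort depends on N and the array size, so it is worth timing on your own data.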