
Find the index of the k smallest values of a numpy array

Tags: python, numpy

In order to find the index of the smallest value, I can use argmin:

    import numpy as np

    A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
    print(A.argmin())     # 4 because A[4] = 0.1

But how can I find the indices of the k-smallest values?

I'm looking for something like:

    print(A.argmin(numberofvalues=3))    # [4, 0, 7]  because A[4] <= A[0] <= A[7] <= all other A[i]

Note: in my use case, A has between ~10 000 and 100 000 values, and I'm only interested in the indices of the k=10 smallest values. k will never be > 10.

Basj asked Dec 11 '15 14:12


2 Answers

Use np.argpartition. It does not sort the entire array; it only guarantees that the element at position k is in its sorted position, with all smaller elements moved before it and all larger elements after it. The first k elements are therefore the k smallest elements.

    import numpy as np

    A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
    k = 3

    idx = np.argpartition(A, k)
    print(idx)
    # [4 0 7 3 1 2 6 5]

Indexing A with the first k entries of idx returns the k smallest values. Note that these may not be in sorted order.

    print(A[idx[:k]])
    # [ 0.1  1.   1.5]
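If the indices are needed in order of increasing value, as in the `[4, 0, 7]` output the question asks for, one option is to sort only the k candidates after partitioning. The following is a minimal sketch; the helper name `k_smallest_indices` is made up for illustration, and it partitions at position `k - 1` rather than `k` so that `k == len(a)` is also accepted.

```python
import numpy as np

def k_smallest_indices(a, k):
    """Return the indices of the k smallest values of a, ordered by value."""
    a = np.asarray(a)
    # Partition so that the k smallest values occupy the first k slots (unordered).
    part = np.argpartition(a, k - 1)[:k]
    # Sort only those k candidates by their values.
    return part[np.argsort(a[part])]

A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
print(k_smallest_indices(A, 3))  # [4 0 7]
```

Because the final sort touches only k elements, the extra cost should be negligible for k around 10.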

To obtain the k-largest values use:

    idx = np.argpartition(A, -k)
    # [4 0 7 3 1 2 6 5]

    A[idx[-k:]]
    # [  9.  17.  17.]

WARNING: Do not (re)use idx = np.argpartition(A, k); A[idx[-k:]] to obtain the k-largest. That won't always work. For example, these are NOT the 3 largest values in x:

    x = np.array([100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 0])
    idx = np.argpartition(x, 3)
    x[idx[-3:]]
    # array([ 70,  80, 100])
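For contrast, here is a small sketch of the same array partitioned with `-k`, which should place the three largest values in the last three slots. The result is sorted before printing only because the order within those slots is not guaranteed.

```python
import numpy as np

x = np.array([100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 0])
k = 3

# Partition around position -k: the last k slots then hold the k largest values.
idx = np.argpartition(x, -k)
largest = x[idx[-k:]]

# The order within the last k slots is not guaranteed, so sort before displaying.
print(np.sort(largest))  # [ 80  90 100]
```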

Here is a comparison against np.argsort, which also works but just sorts the entire array to get the result.

    In [2]: x = np.random.randn(100000)

    In [3]: %timeit idx0 = np.argsort(x)[:100]
    100 loops, best of 3: 8.26 ms per loop

    In [4]: %timeit idx1 = np.argpartition(x, 100)[:100]
    1000 loops, best of 3: 721 µs per loop

    In [5]: np.alltrue(np.sort(np.argsort(x)[:100]) == np.sort(np.argpartition(x, 100)[:100]))
    Out[5]: True
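The comparison can also be run outside IPython. The sketch below is one way to do it with the standard `timeit` module. The seed and the repetition count are arbitrary choices, and `np.array_equal` stands in for `np.alltrue`, which newer NumPy releases no longer provide. Timings will vary by machine, so none are shown.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = rng.standard_normal(100_000)
k = 100

by_sort = np.argsort(x)[:k]
by_partition = np.argpartition(x, k)[:k]

# Both approaches should select the same set of indices, possibly in a different order.
same = np.array_equal(np.sort(by_sort), np.sort(by_partition))
print(same)

# Time 20 repetitions of each approach.
t_sort = timeit.timeit(lambda: np.argsort(x)[:k], number=20)
t_part = timeit.timeit(lambda: np.argpartition(x, k)[:k], number=20)
print(f"argsort: {t_sort:.4f}s, argpartition: {t_part:.4f}s")
```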
unutbu answered Sep 19 '22 22:09


You can use numpy.argsort with slicing:

    >>> import numpy as np
    >>> A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
    >>> np.argsort(A)[:3]
    array([4, 0, 7], dtype=int32)
Cory Kramer answered Sep 20 '22 22:09