
Get the indices of N highest values in an ndarray

Given a histogram of shape 100x100x100, I would like to find the 2 highest values a and b, and their indices (a1, a2, a3) and (b1, b2, b3), such that:

hist[a1][a2][a3] == a
hist[b1][b2][b3] == b

We can easily get the highest value with hist.max(), but how can we get the N highest values in an ndarray?

I understand that one normally uses np.argmax to retrieve the value indices, but in that case:

hist.argmax().shape          # () : a single flat index
for i in range(3):
    hist.argmax(i).shape     # (100, 100) : one index per position along axis i

How can I get a result of shape (3,), i.e. a tuple with one index per dimension?

Adrien Lemaire asked Oct 28 '14 08:10




2 Answers

You can first use numpy.argpartition on a flattened version of the array to get the flat indices of the top k items, and then convert those 1D indices to the array's shape using numpy.unravel_index:

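>>> import numpy as np
>>> arr = np.arange(100*100*100).reshape(100, 100, 100)
>>> np.random.shuffle(arr)  # shuffles along the first axis only
>>> # flat indices of the 2 largest values (order among them is not guaranteed)
>>> indices = np.argpartition(arr.flatten(), -2)[-2:]
>>> # convert flat indices to one (i, j, k) row per value
>>> np.vstack(np.unravel_index(indices, arr.shape)).T
array([[97, 99, 98],
       [97, 99, 99]])
>>> arr[97][99][98]
999998
>>> arr[97][99][99]
999999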
Ashwini Chaudhary answered Sep 18 '22 21:09



You could use np.where:

a=np.random.random((100,100,100))
np.where(a==a.max())
(array([46]), array([62]), array([61]))

To get the indices in a single array:

np.hstack(np.where(a==a.max()))
array([46, 62, 61])

And, since the OP asked for a tuple:

tuple(np.hstack(np.where(a==a.max())))
(46, 62, 61)
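
For the single maximum, an alternative that avoids building the boolean mask a==a.max() is to convert the flat index from argmax with np.unravel_index. The snippet below is a sketch under that assumption and is not from the original answer. Note that argmax reports only the first occurrence if the maximum is repeated, whereas np.where reports all of them.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((10, 10, 10))

# argmax() returns a flat index; unravel_index maps it back to the 3D shape
pos = np.unravel_index(a.argmax(), a.shape)
print(pos)                  # a tuple with one index per dimension
assert a[pos] == a.max()
```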

EDIT:

To get the indices of the N largest values you could use the nlargest function from the heapq module:

import heapq
N=3
np.where(a>=heapq.nlargest(N,a.flatten())[-1])
(array([46, 62, 61]), array([95, 85, 97]), array([70, 35,  2]))
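
np.where returns one array per axis, so the i-th match is read column-wise across those arrays. If one row per match is preferred, np.argwhere gives that layout directly. The sketch below illustrates both forms on a small random array. It is an illustration rather than part of the original answer, and the use of np.partition to find the threshold is a swapped-in alternative to heapq. With repeated values, the threshold comparison can return more than N matches.

```python
import heapq
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((10, 10, 10))
N = 3

threshold = heapq.nlargest(N, a.ravel())[-1]   # N-th largest value
per_axis = np.where(a >= threshold)            # tuple of 3 arrays, one per axis
rows = np.argwhere(a >= threshold)             # one (i, j, k) row per match

assert len(per_axis) == 3
assert rows.shape == (N, 3)   # holds here because the random floats are distinct
# np.partition finds the same threshold without iterating in Python
assert threshold == np.partition(a.ravel(), -N)[-N]
```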
atomh33ls answered Sep 17 '22 21:09
