Considering a histogram of shape 100x100x100, I would like to find the 2 highest values a and b, and their indices (a1, a2, a3) and (b1, b2, b3), such that:
hist[a1][a2][a3] = a
hist[b1][b2][b3] = b
We can easily get the highest value with hist.max(), but how can we get the X highest values in an ndarray?
I understand that one normally uses np.argmax to retrieve the value indices, but in that case:
hist.argmax().shape == ()  # a single flat index
for i in range(3):
    hist.argmax(i).shape == (100, 100)
How can I get a shape (3), a tuple with one value per dimension?
To get the n largest values from a NumPy array, first sort the array with numpy.argsort(), then slice the result using negative indexing. numpy.argsort() returns an ndarray of indices that sort the array along the specified axis.
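As a minimal sketch of the argsort-plus-slicing idea on a 1D array (the sample values and the names order, top_idx and top_vals are made up for illustration):

```python
import numpy as np

arr = np.array([10, 50, 30, 40, 20])  # illustrative data
n = 2

order = np.argsort(arr)        # indices that sort arr in ascending order
top_idx = order[-n:][::-1]     # last n indices, reversed so the largest comes first
top_vals = arr[top_idx]        # the n largest values, largest first
```

Note that argsort sorts the whole array, so for large arrays and small n a partial sort is usually cheaper.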
Python also has a built-in max() function that can calculate maximum values of iterables. You can use this built-in max() to find the maximum element in a one-dimensional NumPy array, but it has no support for arrays with more dimensions.
To get the indices of the N minimum values in NumPy efficiently, use the numpy.argpartition() function.
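A small sketch of that approach, assuming a 1D array with distinct values (the data and variable names are illustrative). argpartition only guarantees that the first n positions hold the n smallest elements, in no particular order, so an extra argsort over those n entries is added here to order them:

```python
import numpy as np

arr = np.array([7, 1, 9, 3, 5, 2])  # illustrative data
n = 3

# Indices of the n smallest values, in arbitrary order.
smallest_idx = np.argpartition(arr, n - 1)[:n]

# Optional: order those n indices by their values, smallest first.
smallest_idx = smallest_idx[np.argsort(arr[smallest_idx])]
```

For the n largest values, the same call works with a negative kth, e.g. np.argpartition(arr, -n)[-n:].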
What is the difference between NumPy and SciPy? In an ideal world, NumPy would contain nothing but the array data type and the most basic operations: indexing, sorting, reshaping, basic elementwise functions, etc. All numerical code would reside in SciPy.
You can use numpy.argpartition on a flattened version of the array first to get the indices of the top k items, and then convert those 1D indices to match the array's shape using numpy.unravel_index:
>>> import numpy as np
>>> arr = np.arange(100*100*100).reshape(100, 100, 100)
>>> np.random.shuffle(arr)  # shuffles along the first axis only
>>> indices = np.argpartition(arr.flatten(), -2)[-2:]  # flat indices of the 2 largest, unordered
>>> np.vstack(np.unravel_index(indices, arr.shape)).T
array([[97, 99, 98],
       [97, 99, 99]])
>>> arr[97][99][98]
999998
>>> arr[97][99][99]
999999
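The steps above could be wrapped in a small helper that also orders the results by value. This is only a sketch: the function name top_k_nd and the random test histogram are assumptions for illustration, not part of NumPy.

```python
import numpy as np

def top_k_nd(arr, k):
    """Return the k largest values of arr and their N-d indices, largest first.

    Hypothetical helper; assumes 1 <= k <= arr.size.
    """
    flat = arr.ravel()                           # flattened view (copy only if needed)
    idx = np.argpartition(flat, -k)[-k:]         # flat indices of the k largest, unordered
    idx = idx[np.argsort(flat[idx])[::-1]]       # order them by value, descending
    coords = np.unravel_index(idx, arr.shape)    # tuple of per-axis index arrays
    positions = list(zip(*(c.tolist() for c in coords)))  # one index tuple per value
    return flat[idx], positions

rng = np.random.default_rng(0)
hist = rng.random((100, 100, 100))               # stand-in for the real histogram

values, positions = top_k_nd(hist, 2)
(a1, a2, a3), (b1, b2, b3) = positions           # indices of the two highest values
```

With this, hist[a1, a2, a3] is the maximum and hist[b1, b2, b3] the second-highest value, which matches what the question asks for.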
You could use where:
a=np.random.random((100,100,100))
np.where(a==a.max())
(array([46]), array([62]), array([61]))
to get in a single array:
np.hstack(np.where(a==a.max()))
array([46, 62, 61])
and, as the OP asked for a tuple:
tuple(np.hstack(np.where(a==a.max())))
(46, 62, 61)
EDIT:
To get the indices of the N largest values you could use the nlargest function from the heapq module:
import heapq
N = 3
np.where(a >= heapq.nlargest(N, a.flatten())[-1])
(array([46, 62, 61]), array([95, 85, 97]), array([70, 35, 2]))
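Note that np.where returns one array per axis, so the i-th matching position is read by taking the i-th entry of each array. A possible variant, sketched here under the assumption that a NumPy-only threshold is acceptable, uses np.partition to find the N-th largest value and np.argwhere to get one row of indices per match (the seed and variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.random((100, 100, 100))   # illustrative data

N = 3
# After partitioning, position -N holds the N-th largest value.
threshold = np.partition(a.ravel(), -N)[-N]

# One row (i, j, k) per element that is >= threshold.
coords = np.argwhere(a >= threshold)
```

As with the heapq version, ties at the threshold value would yield more than N rows, and the rows come back in index order rather than sorted by value.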