I know that I can get min or max values with:
max(matrix)
min(matrix)
out of a NumPy matrix/vector. The indices for those values are returned by:
argmax(matrix)
argmin(matrix)
So e.g. when I have a 5x5 matrix:
a = np.arange(5*5).reshape(5, 5) + 10
# array([[10, 11, 12, 13, 14],
#        [15, 16, 17, 18, 19],
#        [20, 21, 22, 23, 24],
#        [25, 26, 27, 28, 29],
#        [30, 31, 32, 33, 34]])
I could get the max value via:
In [86]: np.max(a) # getting the max-value out of a
Out[86]: 34
In [87]: np.argmax(a) # index of max-value 34 is 24 if array a were flattened
Out[87]: 24
...but what is the most efficient way to get the n largest or n smallest elements?
So let's say that out of a I want the 5 highest and 5 lowest elements. This should return [30, 31, 32, 33, 34] for the 5 highest values and [20, 21, 22, 23, 24] for their indices. Likewise, it should return [10, 11, 12, 13, 14] for the 5 lowest values and [0, 1, 2, 3, 4] for the indices of the 5 lowest elements.
What would be an efficient, reasonable solution for this?
My first idea was to flatten and sort the array and take the last and first 5 values, then search the original 2D matrix for the indices of those values. This works, but flattening and sorting isn't very efficient. Does anyone know a faster solution?
Additionally, I would like to have the indices into the original 2D array and not into the flattened one. So instead of 24 returned by np.argmax(a), I would like to have (4, 4).
numpy.amax() will find the max value in an array, and numpy.amin() does the same for the min value.
Return a copy of the array collapsed into one dimension. 'C' means to flatten in row-major (C-style) order. 'F' means to flatten in column-major (Fortran-style) order.
nanmax() to find the maximum values while ignoring nan values, as well as np.argmax() or .argmax() to find the indices of the maximum values. You won't be surprised to learn that NumPy has an equivalent set of minimum functions: np.
By using the ndarray.flatten() function we can flatten a matrix to one dimension in Python. order: 'C' means to flatten in row-major order. 'F' means to flatten in column-major order.
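As a small illustration of the two flattening orders described above, here is a sketch on a made-up 2x3 array (the array m and its values are purely illustrative, not taken from the question):

```python
import numpy as np

# Illustrative 2x3 array (not from the question)
m = np.array([[1, 2, 3],
              [4, 5, 6]])

row_major = m.flatten(order="C")  # rows laid end to end
col_major = m.flatten(order="F")  # columns laid end to end

print(row_major)  # [1 2 3 4 5 6]
print(col_major)  # [1 4 2 5 3 6]

# flatten() returns a copy, so changing it leaves m untouched
row_major[0] = 99
print(m[0, 0])  # 1
```

Flat indices such as the 24 returned by np.argmax(a) in the question refer to the row-major ('C') layout by default.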
The standard way to get the indices of the largest or smallest values in an array is to use np.argpartition. This function uses an introselect algorithm and runs with linear complexity, which performs better than fully sorting for larger arrays (sorting is typically O(n log n)).
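To make the difference between partial selection and a full sort concrete, here is a minimal sketch comparing np.argpartition with np.argsort on the same data. The array x, its size, and the seed are arbitrary choices for illustration; no timing claims are made here.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
x = rng.integers(0, 1_000_000, size=10_000)
k = 5

# Full sort: orders every element, then keeps the last k positions.
top_by_sort = np.argsort(x)[-k:]

# Partial selection: only guarantees that the last k positions hold
# the k largest values, in no particular order among themselves.
top_by_partition = np.argpartition(x, -k)[-k:]

# Compare the selected values (sorted so that ordering doesn't matter).
print(np.sort(x[top_by_sort]))
print(np.sort(x[top_by_partition]))
```

Both lines should print the same five values. The index arrays themselves may differ in order, since argpartition does not sort within the selected group.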
By default this function works along the last axis of the array. To consider an entire array, you need to use ravel(). For example, here's a random array a:
>>> a = np.random.randint(0, 100, size=(5, 5))
>>> a
array([[60, 68, 86, 66,  9],
       [66, 26, 83, 87, 50],
       [41, 26,  0, 55,  9],
       [57, 80, 71, 50, 22],
       [94, 30, 95, 99, 76]])
Then to get the indices of the five largest values in the (flattened) 2D array, use:
>>> i = np.argpartition(a.ravel(), -5)[-5:] # argpartition(a.ravel(), 5)[:5] for smallest
>>> i
array([ 2,  8, 22, 23, 20])
To get back the corresponding 2D indices of these positions in a, use unravel_index:
>>> i2d = np.unravel_index(i, a.shape)
>>> i2d
(array([0, 1, 4, 4, 4]), array([2, 3, 2, 3, 0]))
Then indexing a with i2d gives back the five largest values:
>>> a[i2d]
array([86, 87, 95, 99, 94])
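Note that the values come back in no particular order. If ordered results are needed, the k selected indices can be sorted afterwards, which is cheap because only k elements are involved. The following is a minimal sketch that wraps the steps above and applies them to the 5x5 array from the question; n_extreme is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def n_extreme(arr, n, largest=True):
    """Return (values, index tuple) of the n largest or n smallest entries,
    ordered ascending by value. Hypothetical helper, not part of NumPy."""
    flat = arr.ravel()
    if largest:
        idx = np.argpartition(flat, -n)[-n:]     # unordered top n
    else:
        idx = np.argpartition(flat, n - 1)[:n]   # unordered bottom n
    idx = idx[np.argsort(flat[idx])]             # sort only the n selected
    return flat[idx], np.unravel_index(idx, arr.shape)

a = np.arange(5 * 5).reshape(5, 5) + 10

vals_hi, idx_hi = n_extreme(a, 5, largest=True)
vals_lo, idx_lo = n_extreme(a, 5, largest=False)

print(vals_hi)  # [30 31 32 33 34]
print(idx_hi)   # rows are all 4, columns are 0..4
print(vals_lo)  # [10 11 12 13 14]
print(idx_lo)   # rows are all 0, columns are 0..4
```

The index tuples can be passed straight back into the array, e.g. a[idx_hi], or zipped into (row, column) pairs with list(zip(*idx_hi)).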