
Get max or min n-elements out of numpy array? (preferably not flattened)

I know that I can get the min or max value of a NumPy matrix/vector with:

np.max(matrix)
np.min(matrix)

The indices of those values are returned by:

np.argmax(matrix)
np.argmin(matrix)

So e.g. when I have a 5x5 matrix:

a = np.arange(5*5).reshape(5, 5) + 10

# array([[10, 11, 12, 13, 14],
#        [15, 16, 17, 18, 19],
#        [20, 21, 22, 23, 24],
#        [25, 26, 27, 28, 29],
#        [30, 31, 32, 33, 34]])

I could get the max value via:

In [86]: np.max(a) # getting the max-value out of a
Out[86]: 34

In [87]: np.argmax(a) # index of max-value 34 is 24 if array a were flattened
Out[87]: 24

...but what is the most efficient way to get the max or min n-elements?

Say I want the 5 highest and the 5 lowest elements of a. This should return [30, 31, 32, 33, 34] as the 5 highest values and [20, 21, 22, 23, 24] as their (flattened) indices. Likewise, it should return [10, 11, 12, 13, 14] as the 5 lowest values and [0, 1, 2, 3, 4] as their indices.

What would be an efficient, reasonable solution for this?

My first idea was to flatten and sort the array, take the first and last 5 values, and then search the original 2D matrix for the indices of those values. This works, but flattening plus a full sort isn't very efficient. Does anyone know a faster solution?
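For reference, the flatten-and-sort baseline described above might look like the following sketch. The lookup step via np.isin and np.argwhere is an assumption about how the search is done, and the variable names are illustrative only.

```python
import numpy as np

a = np.arange(5 * 5).reshape(5, 5) + 10
n = 5

# Full sort of a flattened copy: O(N log N) for N elements.
flat_sorted = np.sort(a.flatten())
lowest = flat_sorted[:n]
highest = flat_sorted[-n:]

# Search the original 2D array for the positions of those values.
# Each row of the result is one (row, col) pair, in row-major order.
lowest_idx = np.argwhere(np.isin(a, lowest))
highest_idx = np.argwhere(np.isin(a, highest))

print(highest, highest_idx.tolist())
```

Because the lookup is by value, duplicate values in the array can produce more than n index pairs.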

Additionally, I would like to get indices into the original 2D array rather than the flattened one. So instead of the 24 returned by np.argmax(a), I would like to have (4, 4).

daniel451 asked Jan 19 '16 14:01


1 Answer

The standard way to get the indices of the largest or smallest values in an array is to use np.argpartition. This function uses an introselect algorithm and runs in linear time, which beats a full sort (typically O(n log n)) on larger arrays.
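As a rough illustration of what np.argpartition guarantees, here is a minimal sketch on a 1D array. The array x and the choice k = 3 are made up for the example.

```python
import numpy as np

x = np.array([60, 68, 86, 66, 9, 66, 26, 83, 87, 50])
k = 3

idx = np.argpartition(x, k)  # returns indices, not values
part = x[idx]

# part[k] is the value a full sort would place at position k.
# Everything before it is <= part[k], everything after is >= part[k].
# The order within each side is unspecified.
assert part[k] == np.sort(x)[k]
assert (part[:k] <= part[k]).all()
assert (part[k:] >= part[k]).all()

# If sorted output is needed, sort only the k selected values.
smallest_k = np.sort(part[:k])
print(smallest_k)
```

Only the k selected values are sorted at the end, so the extra cost is O(k log k) rather than a sort of the whole array.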

By default this function works along the last axis of the array. To consider the entire array, flatten it first with ravel(), which returns a view where possible and so avoids a copy. For example, here's a random array a:

>>> a = np.random.randint(0, 100, size=(5, 5))
>>> a
array([[60, 68, 86, 66,  9],
       [66, 26, 83, 87, 50],
       [41, 26,  0, 55,  9],
       [57, 80, 71, 50, 22],
       [94, 30, 95, 99, 76]])

Then to get the indices of the five largest values in the (flattened) 2D array, use the following. Note that the five indices come back in no particular order:

>>> i = np.argpartition(a.ravel(), -5)[-5:] # argpartition(a.ravel(), 5)[:5] for smallest
>>> i
array([ 2,  8, 22, 23, 20])

To get back the corresponding 2D indices of these positions in a, use unravel_index:

>>> i2d = np.unravel_index(i, a.shape)
>>> i2d
(array([0, 1, 4, 4, 4]), array([2, 3, 2, 3, 0]))

Then indexing a with i2d gives back the five largest values:

>>> a[i2d]
array([86, 87, 95, 99, 94])
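
The steps above can be wrapped into a single helper. The function name top_n and its largest flag are hypothetical, not part of NumPy, and the sketch assumes 1 <= n <= a.size. It also sorts the n selected entries, which np.argpartition alone does not do.

```python
import numpy as np

def top_n(a, n, largest=True):
    """Hypothetical helper: values and N-d indices of the n extreme entries."""
    flat = a.ravel()
    if not 1 <= n <= flat.size:
        raise ValueError("n must satisfy 1 <= n <= a.size")
    if largest:
        idx = np.argpartition(flat, -n)[-n:]
    else:
        idx = np.argpartition(flat, n - 1)[:n]
    # argpartition leaves the selected entries unordered; sort just those n.
    idx = idx[np.argsort(flat[idx])]
    if largest:
        idx = idx[::-1]  # largest value first
    return flat[idx], np.unravel_index(idx, a.shape)

a = np.array([[60, 68, 86, 66,  9],
              [66, 26, 83, 87, 50],
              [41, 26,  0, 55,  9],
              [57, 80, 71, 50, 22],
              [94, 30, 95, 99, 76]])

vals, (rows, cols) = top_n(a, 5)
low_vals, _ = top_n(a, 5, largest=False)
print(vals, rows, cols)
```

With tied values (such as the two 26s here), which of the tied positions is returned is not specified.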
Alex Riley answered Nov 15 '22 00:11