
Mask from max values in numpy array, specific axis

Tags: python, numpy

Input example:

I have a numpy array, e.g.

a=np.array([[0,1], [2, 1], [4, 8]])

Desired output:

I would like to produce a mask array with the max value along a given axis, in my case axis 1, being True and all others being False. e.g. in this case

mask = np.array([[False, True], [True, False], [False, True]])

Attempt:

I have tried approaches using np.amax, but this returns only the max values themselves, as a 1-D array:

>>> np.amax(a, axis=1)
array([1, 2, 8])

and np.argmax similarly returns the indices of the max values along that axis.

>>> np.argmax(a, axis=1)
array([1, 0, 1])

I could iterate over this in some way, but as these arrays get bigger I want the solution to stay native to numpy.

feedMe asked Dec 06 '17 15:12




3 Answers

Method #1

We can compare the array against its max values along the axis, keeping the reduced dimension so that the comparison broadcasts -

a.max(axis=1,keepdims=1) == a

Sample run -

In [83]: a
Out[83]: 
array([[0, 1],
       [2, 1],
       [4, 8]])

In [84]: a.max(axis=1,keepdims=1) == a
Out[84]: 
array([[False,  True],
       [ True, False],
       [False,  True]], dtype=bool)
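
The same comparison works along any axis. Below is a minimal sketch, assuming a hypothetical helper name `max_mask` (it is not part of NumPy) and using `keepdims=True`, which is equivalent to the `keepdims=1` above:

```python
import numpy as np

def max_mask(a, axis):
    """Hypothetical helper: True wherever an element equals the max along `axis`."""
    a = np.asarray(a)
    # keepdims=True keeps the reduced axis with length 1,
    # so the max values broadcast back against `a`.
    return a == a.max(axis=axis, keepdims=True)

a = np.array([[0, 1], [2, 1], [4, 8]])
mask_rows = max_mask(a, axis=1)  # max within each row
mask_cols = max_mask(a, axis=0)  # max within each column
```

Note that with this approach every element tied for the max is marked True, not just the first one.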

Method #2

Alternatively, take the argmax indices, turn them into a column, and compare them by broadcasting against the range of column indices -

In [92]: a.argmax(axis=1)[:,None] == range(a.shape[1])
Out[92]: 
array([[False,  True],
       [ True, False],
       [False,  True]], dtype=bool)

Method #3

To finish off the set, and if we are looking for performance, use initialization and then advanced indexing -

out = np.zeros(a.shape, dtype=bool)
out[np.arange(len(a)), a.argmax(axis=1)] = 1
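
For completeness, here is a self-contained version of the snippet above, followed by one possible generalization to an arbitrary axis. The helper name `argmax_mask` is hypothetical, and the use of `np.put_along_axis` is an assumption about how one might extend the idea, not part of the original answer:

```python
import numpy as np

a = np.array([[0, 1], [2, 1], [4, 8]])

# 2-D case from the snippet above: one True per row, at the argmax column.
out = np.zeros(a.shape, dtype=bool)
out[np.arange(len(a)), a.argmax(axis=1)] = True

def argmax_mask(a, axis):
    """Hypothetical helper: one True per slice along `axis`, at the first max."""
    # Re-insert the reduced axis so the indices line up with `a`.
    idx = np.expand_dims(a.argmax(axis=axis), axis)
    mask = np.zeros(a.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=axis)
    return mask

out_general = argmax_mask(a, axis=1)
```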
Divakar answered Sep 28 '22 05:09


Create an identity matrix and select from its rows using argmax on your array:

np.identity(a.shape[1], bool)[a.argmax(axis=1)]
# array([[False,  True],
#        [ True, False],
#        [False,  True]], dtype=bool)

Please note that this ignores ties: it marks only the index returned by argmax, which is the first occurrence of the max.
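
To make the tie behaviour concrete, here is a small sketch on a made-up array `t` (not from the question) whose first row has two equal maxima. It contrasts the argmax-based mask with a comparison against the row max:

```python
import numpy as np

# Hypothetical input with a tie in the first row.
t = np.array([[3, 3], [1, 2]])

# argmax-based: exactly one True per row (the first max).
by_argmax = np.identity(t.shape[1], bool)[t.argmax(axis=1)]

# comparison-based: every element equal to the row max is True.
by_compare = t == t.max(axis=1, keepdims=True)
```

Here `by_argmax` marks only the first 3 in the tied row, while `by_compare` marks both. Which behaviour is wanted depends on the use case.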

Paul Panzer answered Sep 28 '22 05:09


You're already halfway to the answer. Once you compute the max along an axis, you can compare it with the input array, and you'll have the required boolean mask!

In [7]: maxx = np.amax(a, axis=1)

In [8]: maxx
Out[8]: array([1, 2, 8])

In [12]: a >= maxx[:, None]
Out[12]: 
array([[False,  True],
       [ True, False],
       [False,  True]], dtype=bool)

Note: This relies on NumPy broadcasting when comparing a with maxx. The [:, None] turns maxx into a column so that each row of a is compared with its own max.
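
The shapes involved in that comparison can be checked directly. This is a small sketch using the question's array; the variable names `col` and `shapes` are introduced here only for illustration:

```python
import numpy as np

a = np.array([[0, 1], [2, 1], [4, 8]])
maxx = np.amax(a, axis=1)   # shape (3,): one max per row
col = maxx[:, None]         # shape (3, 1): the same values as a column

# (3, 2) compared with (3, 1) broadcasts to (3, 2).
mask = a >= col

shapes = (a.shape, maxx.shape, col.shape, mask.shape)
```

Without the `[:, None]`, the shapes (3, 2) and (3,) do not line up for broadcasting on this array.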

kmario23 answered Sep 28 '22 05:09