Input example:
I have a numpy array, e.g.
a=np.array([[0,1], [2, 1], [4, 8]])
Desired output:
I would like to produce a mask array with the max value along a given axis, in my case axis 1, being True and all others being False. e.g. in this case
mask = np.array([[False, True], [True, False], [False, True]])
Attempt:
I have tried approaches using np.amax
but this returns the max values in a flattened list:
>>> np.amax(a, axis=1)
array([1, 2, 8])
and np.argmax
similarly returns the indices of the max values along that axis.
>>> np.argmax(a, axis=1)
array([1, 0, 1])
I could iterate over this in some way but once these arrays become bigger I want the solution to remain something native in numpy.
There is argmin() and argmax() provided by numpy that returns the index of the min and max of a numpy array respectively. Note that these will only return the index of the first occurrence.
The [:, :] stands for everything from the beginning to the end just like for lists. The difference is that the first : stands for first and the second : for the second dimension. a = numpy. zeros((3, 3)) In [132]: a Out[132]: array([[ 0., 0., 0.], [ 0., 0., 0.], [ 0., 0., 0.]])
Boolean masking, also called boolean indexing, is a feature in Python NumPy that allows for the filtering of values in numpy arrays. There are two main ways to carry out boolean masking: Method one: Returning the result array.
Returns the indices of the maximum values along an axis. Input array. By default, the index is into the flattened array, otherwise along the specified axis.
Method #1
Using broadcasting
, we can use comparison against the max values, while keeping dims to facilitate broadcasting
-
a.max(axis=1,keepdims=1) == a
Sample run -
In [83]: a
Out[83]:
array([[0, 1],
[2, 1],
[4, 8]])
In [84]: a.max(axis=1,keepdims=1) == a
Out[84]:
array([[False, True],
[ True, False],
[False, True]], dtype=bool)
Method #2
Alternatively with argmax
indices for one more case of broadcasted-comparison
against the range of indices along the columns -
In [92]: a.argmax(axis=1)[:,None] == range(a.shape[1])
Out[92]:
array([[False, True],
[ True, False],
[False, True]], dtype=bool)
Method #3
To finish off the set, and if we are looking for performance, use intialization and then advanced-indexing
-
out = np.zeros(a.shape, dtype=bool)
out[np.arange(len(a)), a.argmax(axis=1)] = 1
Create an identity matrix and select from its rows using argmax
on your array:
np.identity(a.shape[1], bool)[a.argmax(axis=1)]
# array([[False, True],
# [ True, False],
# [False, True]], dtype=bool)
Please note that this ignores ties, it just goes with the value returned by argmax
.
You're already halfway in the answer. Once you compute the max along an axis, you can compare it with the input array and you'll have the required binary mask!
In [7]: maxx = np.amax(a, axis=1)
In [8]: maxx
Out[8]: array([1, 2, 8])
In [12]: a >= maxx[:, None]
Out[12]:
array([[False, True],
[ True, False],
[False, True]], dtype=bool)
Note: This uses NumPy broadcasting when doing the comparison between a
and maxx
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With