Boolean masking on multiple axes with numpy

Tags: python, numpy
I want to apply boolean masking both to rows and columns.

With

X = np.array([[1,2,3],[4,5,6]])
mask1 = np.array([True, True])
mask2 = np.array([True, True, False])
X[mask1, mask2]

I expect the output to be

array([[1,2],[4,5]])

instead of

array([1,5])

It's known that

X[:, mask2]

can be used here but that's not a solution for the general case.

I would like to know how it works under the hood and why in this case the result is array([1,5]).

321

asked Feb 18 '17 00:02

tarashypka

2 Answers

X[mask1, mask2] is described in the Boolean Array Indexing documentation as the equivalent of

In [249]: X[mask1.nonzero()[0], mask2.nonzero()[0]]
Out[249]: array([1, 5])
In [250]: X[[0,1], [0,1]]
Out[250]: array([1, 5])

In effect it is giving you X[0,0] and X[1,1] (pairing the 0s and 1s).
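To make the pairing explicit, here is a minimal sketch that reproduces X[mask1, mask2] by hand. The names rows, cols, paired, mask3 and pairing_failed are illustrative only and do not appear in the question.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])
mask1 = np.array([True, True])
mask2 = np.array([True, True, False])

# Each boolean mask is first converted to the positions of its True entries.
rows = mask1.nonzero()[0]   # array([0, 1])
cols = mask2.nonzero()[0]   # array([0, 1])

# The two integer arrays are then paired element by element: (0, 0) and (1, 1).
paired = np.array([X[r, c] for r, c in zip(rows, cols)])
assert np.array_equal(paired, X[mask1, mask2])   # both are [1, 5]

# If the masks have different numbers of True values, the resulting index
# arrays (shapes (2,) and (3,)) cannot be broadcast together, so the
# indexing fails with an IndexError.
mask3 = np.array([True, True, True])   # hypothetical mask for illustration
try:
    X[mask1, mask3]
    pairing_failed = False
except IndexError:
    pairing_failed = True
```

So the result array([1, 5]) is not a quirk of these particular masks; it only appears to work because both masks happen to contain two True values.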

What you want instead is:

In [251]: X[[[0],[1]], [0,1]]
Out[251]: 
array([[1, 2],
       [4, 5]])

np.ix_ is a handy tool for creating the right mix of dimensions:

In [258]: np.ix_([0,1],[0,1])
Out[258]: 
(array([[0],
        [1]]), array([[0, 1]]))
In [259]: X[np.ix_([0,1],[0,1])]
Out[259]: 
array([[1, 2],
       [4, 5]])

That's effectively a column vector for the 1st axis and row vector for the second, together defining the desired rectangle of values.
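As a rough illustration of why the column/row combination selects a rectangle, the sketch below expands the two index arrays with np.broadcast_arrays. The variable names (rows, cols, row_grid, col_grid) are assumptions made for this example.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])

rows = np.array([[0], [1]])   # shape (2, 1): a column of row indices
cols = np.array([0, 1])       # shape (2,):   a row of column indices

# Integer index arrays are broadcast against each other before indexing.
row_grid, col_grid = np.broadcast_arrays(rows, cols)
# row_grid is [[0, 0], [1, 1]] and col_grid is [[0, 1], [0, 1]],
# i.e. every (row, column) combination in the rectangle.

result = X[rows, cols]
assert np.array_equal(result, X[row_grid, col_grid])
assert result.shape == (2, 2)
```

With two 1-D index arrays of the same length there is nothing to broadcast, which is why the elements get paired instead.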

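But trying to broadcast boolean arrays like this does not work: X[mask1[:,None], mask2] raises an IndexError, because a 2-D boolean array is treated as a mask over two axes of X rather than being broadcast against mask2.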

But that reference section says:

Combining multiple Boolean indexing arrays or a Boolean with an integer indexing array can best be understood with the obj.nonzero() analogy. The function ix_ also supports boolean arrays and will work without any surprises.

In [260]: X[np.ix_(mask1, mask2)]
Out[260]: 
array([[1, 2],
       [4, 5]])
In [261]: np.ix_(mask1, mask2)
Out[261]: 
(array([[0],
        [1]], dtype=int32), array([[0, 1]], dtype=int32))

The boolean section of ix_:

    if issubdtype(new.dtype, _nx.bool_):
        new, = new.nonzero()

So it works with a mix like X[np.ix_(mask1, [0,2])]
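A short sketch of that mixed case, plus assignment through the same index, might look like this. The names mixed and Y are made up for the example.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])
mask1 = np.array([True, True])
mask2 = np.array([True, True, False])

# Boolean mask for the rows, integer list for the columns.
mixed = X[np.ix_(mask1, [0, 2])]
# mixed is [[1, 3], [4, 6]]

# The same index tuple can be used on the left-hand side of an assignment.
Y = X.copy()
Y[np.ix_(mask1, mask2)] = 0
# Y is now [[0, 0, 3], [0, 0, 6]]
```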

174

answered Nov 02 '22 10:11

hpaulj

One solution would be to use sequential integer indexing and get the integers, for example, from np.where (rows first with mask1, then columns with mask2):

>>> X[np.where(mask1)[0]][:, np.where(mask2)[0]]
array([[1, 2],
       [4, 5]])

or, as @user2357112 pointed out in the comments, np.ix_ could be used as well. For example:

>>> X[np.ix_(np.where(mask1)[0], np.where(mask2)[0])]
array([[1, 2],
       [4, 5]])

Another idea would be to broadcast your masks and index in one step, although that requires a reshape afterwards:

>>> X[np.where(mask1[:, None] * mask2)]
array([1, 2, 4, 5])

>>> X[np.where(mask1[:, None] * mask2)].reshape(2, 2)
array([[1, 2],
       [4, 5]])
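The reshape(2, 2) above is hard-coded for this example. A sketch of a more general version could derive the target shape from the number of True values in each mask. The masks below are different, hypothetical ones chosen so that the result is not square, and the names outer, flat and rect are assumptions.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])
mask1 = np.array([True, False])        # keep row 0 only
mask2 = np.array([True, False, True])  # keep columns 0 and 2

# Outer "and" of the two masks: a 2-D boolean mask with the same shape as X.
outer = mask1[:, None] & mask2

# A 2-D boolean index returns the selected elements as a flat array.
flat = X[outer]

# Restore the rectangle: (number of kept rows, number of kept columns).
rect = flat.reshape(mask1.sum(), mask2.sum())
assert np.array_equal(rect, X[np.ix_(mask1, mask2)])
```

Using & instead of * gives the same mask for boolean inputs and states the intent a little more directly.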

answered Nov 02 '22 10:11

MSeifert
