I have two NumPy arrays a and b, each with dimensions m by n. I also have a Boolean vector mask of length n, and I want to produce a new array c that takes each of its n columns from either a or b: if mask[i] is true, I take column i from b, otherwise from a.
How do I do this in the most efficient way possible?
I've looked at select, where and choose.
First off, let's set up some example code:
import numpy as np
m, n = 5, 3
a = np.zeros((m, n))
b = np.ones((m, n))
boolvec = np.random.randint(0, 2, m).astype(bool)
Just to show what this data might look like:
In [2]: a
Out[2]:
array([[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]])
In [3]: b
Out[3]:
array([[ 1., 1., 1.],
[ 1., 1., 1.],
[ 1., 1., 1.],
[ 1., 1., 1.],
[ 1., 1., 1.]])
In [4]: boolvec
Out[4]: array([ True, True, False, False, False], dtype=bool)
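In this case, np.where is the most efficient tool. However, boolvec needs a shape that broadcasts against a and b. Here boolvec has length m (one entry per row), so we turn it into a column vector of shape (m, 1) by indexing with np.newaxis or None (they're the same). A vector of length n that selects columns already broadcasts against an (m, n) array and needs no reshaping.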
In [5]: boolvec[:,None]
Out[5]:
array([[ True],
[ True],
[False],
[False],
[False]], dtype=bool)
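To confirm that the reshaped vector lines up with a and b, the shapes can be checked before calling np.where. This is a minimal sketch reusing the names from above; the seeded generator is an assumption added only to make the run repeatable.

```python
import numpy as np

m, n = 5, 3
a = np.zeros((m, n))
b = np.ones((m, n))

# Assumption: a seeded generator, used here only for repeatability.
rng = np.random.default_rng(0)
boolvec = rng.integers(0, 2, m).astype(bool)

# Indexing with None (an alias of np.newaxis) adds a trailing axis.
col = boolvec[:, None]
print(boolvec.shape)  # (5,)
print(col.shape)      # (5, 1)

# np.broadcast reports the shape the three operands broadcast to.
bshape = np.broadcast(col, a, b).shape
print(bshape)         # (5, 3)
```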
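And then we can make the final result using np.where. Note that np.where(cond, x, y) takes x where cond is true, so the rows marked True come from a here; swap the last two arguments to take them from b instead: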
In [6]: c = np.where(boolvec[:, None], a, b)
In [7]: c
Out[7]:
array([[ 0., 0., 0.],
[ 0., 0., 0.],
[ 1., 1., 1.],
[ 1., 1., 1.],
[ 1., 1., 1.]])
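The example above picks whole rows. For the layout described in the question (a mask of length n choosing columns, with True meaning "take from b"), a sketch might look like the following. The name mask and the sample values are illustrative assumptions.

```python
import numpy as np

m, n = 3, 4
a = np.arange(m * n).reshape(m, n)
b = a + 100

# Hypothetical column mask: one entry per column, True means "take from b".
mask = np.array([False, True, True, False])

# A length-n mask broadcasts across the rows of an (m, n) array as-is.
c = np.where(mask, b, a)
print(c)
# [[  0 101 102   3]
#  [  4 105 106   7]
#  [  8 109 110  11]]

# Alternative: copy a, then overwrite only the selected columns.
c2 = a.copy()
c2[:, mask] = b[:, mask]
```

Both variants give the same array. The copy-and-assign form only touches the selected columns, which may help when few columns come from b, but that is worth timing on real data rather than assuming.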
You could use np.choose for this. Take these example a and b arrays:
>>> a = np.arange(12).reshape(3,4)
>>> b = np.arange(12).reshape(3,4) + 100
>>> a_and_b = np.array([a, b])
To use np.choose, we want a 3D array with both arrays; a_and_b looks like this:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[100, 101, 102, 103],
[104, 105, 106, 107],
[108, 109, 110, 111]]])
Now let the selector array be bl = np.array([0, 1, 1, 0]). np.choose indexes with integers, so 0 picks from a and 1 picks from b. Then:
>>> np.choose(bl, a_and_b)
array([[ 0, 101, 102, 3],
[ 4, 105, 106, 7],
[ 8, 109, 110, 11]])
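If the selector starts out as a genuine Boolean array, converting it to integers first makes the indexing explicit. This sketch (bool_mask is an assumed name) also checks that the result agrees with the np.where approach:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = a + 100

# Hypothetical Boolean selector, one entry per column.
bool_mask = np.array([False, True, True, False])

# False -> 0 (pick from a), True -> 1 (pick from b).
idx = bool_mask.astype(int)
c_choose = np.choose(idx, [a, b])

# Same selection expressed with np.where, for comparison.
c_where = np.where(bool_mask, b, a)
print(np.array_equal(c_choose, c_where))  # True
```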