I have a 2D numpy array, something like this:
arr = np.array([[1,2,4],
[2,1,1],
[1,2,3]])
and a boolean array:
boolarr = np.array([[True, True, False],
[False, False, True],
[True, True,True]])
Now, when I try to slice arr based on boolarr, it gives me
arr[boolarr]
Output:
array([1, 2, 1, 1, 2, 3])
But I am looking to have a 2D array output instead. The desired output is
[[1, 2],
[1],
[1, 2, 3]]
An option using numpy is to start by summing each row of the mask, which gives the number of selected elements per row:
take = boolarr.sum(axis=1)
#array([2, 1, 3])
Then mask the array as you do:
x = arr[boolarr]
#array([1, 2, 1, 1, 2, 3])
And use np.split to split the flat array according to the np.cumsum of take (as the function expects the indices at which to split the array, and the last cumulative sum is just the total length, so it is dropped):
np.split(x, np.cumsum(take)[:-1])
[array([1, 2]), array([1]), array([1, 2, 3])]
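Put together, the three steps above can be run as a single script. This is only a sketch using the arrays from the question; the names split_at and rows are introduced here for illustration and are not part of the original answer.

```python
import numpy as np

arr = np.array([[1, 2, 4],
                [2, 1, 1],
                [1, 2, 3]])
boolarr = np.array([[True, True, False],
                    [False, False, True],
                    [True, True, True]])

# Number of True values in each row of the mask
take = boolarr.sum(axis=1)

# Flat 1D array of the selected elements, in row-major order
x = arr[boolarr]

# Cumulative counts mark where each row ends in the flat array;
# the last one equals len(x), so it is not needed as a split point
split_at = np.cumsum(take)[:-1]

rows = np.split(x, split_at)
print([r.tolist() for r in rows])  # [[1, 2], [1], [1, 2, 3]]
```

Because boolean indexing returns the elements in row-major order, cutting the flat array at the cumulative row counts recovers one piece per row.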
General solution
import numpy as np

def mask_nd(x, m):
    '''
    Mask a 2D array and keep the row
    structure in the result

    Parameters
    ----------
    x: np.array
        2D array on which to apply a mask
    m: np.array
        2D boolean mask

    Returns
    -------
    List of arrays. Each array contains the
    elements from the rows in x once masked.
    If no elements in a row are selected the
    corresponding array will be empty
    '''
    # Number of selected elements in each row
    take = m.sum(axis=1)
    # Split the flat masked array at the row boundaries
    return np.split(x[m], np.cumsum(take)[:-1])
Examples
Let's have a look at some examples:
arr = np.array([[1,2,4],
[2,1,1],
[1,2,3]])
boolarr = np.array([[True, True, False],
[False, False, False],
[True, True,True]])
mask_nd(arr, boolarr)
# [array([1, 2]), array([], dtype=int32), array([1, 2, 3])]
Or for the following arrays:
arr = np.array([[1,2],
[2,1]])
boolarr = np.array([[True, True],
[True, False]])
mask_nd(arr, boolarr)
# [array([1, 2]), array([2])]
Your desired output is not a 2D array, since each "row" has a different number of "columns". A functional, non-vectorised solution is possible via itertools.compress:
from itertools import compress
res = list(map(list, map(compress, arr, boolarr)))
# [[1, 2], [1], [1, 2, 3]]
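For comparison, a similar non-vectorised result can probably be had without itertools by indexing each row with its own mask row. This is a different technique from compress, shown only as a sketch; it returns a list of numpy arrays rather than a list of lists, and the name rows is made up for this example.

```python
import numpy as np

arr = np.array([[1, 2, 4],
                [2, 1, 1],
                [1, 2, 3]])
boolarr = np.array([[True, True, False],
                    [False, False, True],
                    [True, True, True]])

# Iterating over a 2D array yields its rows, so zip pairs
# each data row with the matching row of the boolean mask
rows = [row[mask] for row, mask in zip(arr, boolarr)]

print([r.tolist() for r in rows])  # [[1, 2], [1], [1, 2, 3]]
```

Like the compress version, this loops in Python once per row, so it is likely to be slower than a split-based approach on arrays with many rows.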