I have a 2D NumPy array like this:
arr = np.array([[1,2,4],
                [2,1,1],
                [1,2,3]])
and a boolean array:
boolarr = np.array([[True, True, False],
                    [False, False, True],
                    [True, True,True]])
Now, when I index arr with boolarr, I get a flat 1D array:
arr[boolarr]
Output:
array([1, 2, 1, 1, 2, 3])
But I am looking to have a 2D array output instead. The desired output is
[[1, 2],
 [1],
 [1, 2, 3]]
One option using NumPy is to start by counting the selected elements in each row of the mask:
take = boolarr.sum(axis=1)
#array([2, 1, 3])
Then mask the array as you do:
x = arr[boolarr]
#array([1, 2, 1, 1, 2, 3])
Finally, use np.split to cut the flat array at the cumulative sums of take, since np.split expects the indices at which to split. The last cumulative sum equals the total length, so it is dropped:
np.split(x, np.cumsum(take)[:-1])
[array([1, 2]), array([1]), array([1, 2, 3])]
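The three steps above can be put together as a short, self-contained sketch. It relies on boolean indexing returning elements in row-major order, so each row's selected values sit next to each other in the flat result. The names split_points and rows_as_lists are illustrative and not part of the original answer.

```python
import numpy as np

arr = np.array([[1, 2, 4],
                [2, 1, 1],
                [1, 2, 3]])
boolarr = np.array([[True, True, False],
                    [False, False, True],
                    [True, True, True]])

# Number of selected elements in each row: [2, 1, 3]
take = boolarr.sum(axis=1)

# Cumulative sums are [2, 3, 6]; dropping the last one leaves the
# positions where one row's values end and the next row's begin.
split_points = np.cumsum(take)[:-1]

# Flat selection in row-major order, then cut at the split points.
rows = np.split(arr[boolarr], split_points)

# Convert to plain nested lists to match the desired output.
rows_as_lists = [r.tolist() for r in rows]
print(rows_as_lists)  # [[1, 2], [1], [1, 2, 3]]
```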
General solution
def mask_nd(x, m):
    '''
    Mask a 2D array row by row, keeping the
    selected elements of each row together

    Parameters
    ----------
    x: np.array
       2D array on which to apply a mask
    m: np.array
       2D boolean mask with the same shape as x

    Returns
    -------
    List of arrays. Each array contains the
    elements from the corresponding row of x
    that were selected by the mask. If no
    elements in a row are selected, the
    corresponding array will be empty
    '''
    take = m.sum(axis=1)
    return np.split(x[m], np.cumsum(take)[:-1])
Examples
Let's have a look at some examples:
arr = np.array([[1,2,4],
                [2,1,1],
                [1,2,3]])
boolarr = np.array([[True, True, False],
                    [False, False, False],
                    [True, True,True]])
mask_nd(arr, boolarr)
# [array([1, 2]), array([], dtype=int32), array([1, 2, 3])]  (dtype may show as int64 depending on platform)
Or for the following arrays:
arr = np.array([[1,2],
                [2,1]])
boolarr = np.array([[True, True],
                    [True, False]])
mask_nd(arr, boolarr)
# [array([1, 2]), array([2])]
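As a further sanity check, here is a small sketch of how the same approach behaves when the first and last rows of the mask select nothing. The function is repeated so the snippet runs on its own; the test data and the names mask and parts are made up for illustration.

```python
import numpy as np

def mask_nd(x, m):
    # Same approach as above: count selections per row,
    # then split the flat masked result at the cumulative counts.
    take = m.sum(axis=1)
    return np.split(x[m], np.cumsum(take)[:-1])

arr = np.arange(12).reshape(4, 3)   # rows: [0 1 2], [3 4 5], [6 7 8], [9 10 11]
mask = np.array([[False, False, False],
                 [True, False, True],
                 [False, True, False],
                 [False, False, False]])

parts = mask_nd(arr, mask)

# One output array per input row, with empty arrays for all-False rows.
print([p.tolist() for p in parts])  # [[], [3, 5], [7], []]
```

The result always has as many entries as the mask has rows, so row positions are preserved even when some rows are empty.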
Your desired output is not a 2D array, since each "row" has a different number of "columns". A functional, non-vectorised solution is possible via itertools.compress:
from itertools import compress
res = list(map(list, map(compress, arr, boolarr)))
# [[1, 2], [1], [1, 2, 3]]
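A comparable non-vectorised alternative, offered only as a sketch and not as part of the original answer, is to apply boolean indexing one row at a time in a list comprehension. Calling tolist() on each piece yields plain Python integers rather than NumPy scalars.

```python
import numpy as np

arr = np.array([[1, 2, 4],
                [2, 1, 1],
                [1, 2, 3]])
boolarr = np.array([[True, True, False],
                    [False, False, True],
                    [True, True, True]])

# Pair each row with its mask row and index it individually.
res = [row[mask].tolist() for row, mask in zip(arr, boolarr)]
print(res)  # [[1, 2], [1], [1, 2, 3]]
```

Like the itertools.compress version, this loops over rows in Python, so it trades speed on large arrays for readability.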