
How to properly mask a numpy 2D array?

Say I have a two dimensional array of coordinates that looks something like

x = array([[1,2],[2,3],[3,4]])

In my work so far, I have generated a mask that ends up looking something like

mask = [False,False,True]

When I try to use this mask on the 2D coordinate vector, I get an error

newX = np.ma.compressed(np.ma.masked_array(x,mask))
>>>numpy.ma.core.MaskError: Mask and data not compatible: data size is 6, mask size is 3.

which makes sense, I suppose. So I tried to simply use the following mask instead:

mask2 = np.column_stack((mask,mask))
newX = np.ma.compressed(np.ma.masked_array(x,mask2))

And what I get is close:

>>>array([1,2,2,3])

to what I would expect (and want):

>>>array([[1,2],[2,3]])

There must be an easier way to do this?

asked Jul 05 '16 01:07 by pretzlstyle






2 Answers

Is this what you are looking for?

import numpy as np
x[~np.array(mask)]
# array([[1, 2],
#        [2, 3]])

Or from numpy masked array:

newX = np.ma.array(x, mask = np.column_stack((mask, mask)))
newX
# masked_array(data =
#  [[1 2]
#  [2 3]
#  [-- --]],
#              mask =
#  [[False False]
#  [False False]
#  [ True  True]],
#        fill_value = 999999)
answered Oct 06 '22 10:10 by Psidom


Your x is 3x2:

In [379]: x
Out[379]:
array([[1, 2],
       [2, 3],
       [3, 4]])

Make a 3 element boolean mask:

In [380]: rowmask=np.array([False,False,True]) 

That can be used to select the rows where it is True, or where it is False. In both cases the result is 2d:

In [381]: x[rowmask,:]
Out[381]: array([[3, 4]])

In [382]: x[~rowmask,:]
Out[382]:
array([[1, 2],
       [2, 3]])
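If the mask starts out as a plain Python list, as in the question, it has to be converted to a boolean array before `~` can be applied, since a list does not support that operator. A minimal sketch of the row selection above, assuming True marks the rows to drop (the names `kept` and `dropped` are only illustrative):

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])
mask = [False, False, True]  # plain list; True marks a row to drop

# Convert once so that boolean indexing and `~` both work.
rowmask = np.asarray(mask, dtype=bool)

kept = x[~rowmask]     # rows where the mask is False
dropped = x[rowmask]   # rows where the mask is True

print(kept)
print(dropped)
```

Both results stay two-dimensional because the boolean index is applied to the first axis only.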

This is without using the MaskedArray subclass. To make such an array, we need a mask that matches x in shape. There is no provision for masking just one dimension.

In [393]: xmask=np.stack((rowmask,rowmask),-1)  # column stack

In [394]: xmask
Out[394]:
array([[False, False],
       [False, False],
       [ True,  True]], dtype=bool)

In [395]: np.ma.MaskedArray(x,xmask)
Out[395]:
masked_array(data =
 [[1 2]
 [2 3]
 [-- --]],
             mask =
 [[False False]
 [False False]
 [ True  True]],
       fill_value = 999999)

Applying compressed to that produces a raveled array: array([1, 2, 2, 3])
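For arrays with more than two columns, stacking the row mask by hand gets tedious. One possible alternative, shown here only as a sketch, is to repeat the row mask along a new trailing axis with `np.repeat` so that it matches the shape of x:

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])
rowmask = np.array([False, False, True])

# rowmask[:, None] has shape (3, 1); repeat it once per column of x.
xmask = np.repeat(rowmask[:, None], x.shape[1], axis=1)

mx = np.ma.MaskedArray(x, mask=xmask)
flat = mx.compressed()  # unmasked values, always 1-D

print(xmask)
print(flat)
```

The mask now has the same shape as the data, and `compressed` still returns the flattened values.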

Since masking is element by element, it could mask one element in row 1, two in row 2, and so on. So in general, compressing (removing the masked elements) will not yield a 2d array. The flattened form is the only general choice.
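If you do want to stay within np.ma and get a 2d result, `np.ma.compress_rows` drops every row that contains at least one masked value. The sketch below contrasts it with `compressed` on a whole-row mask and on a scattered mask; the variable names are only illustrative:

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])

# Whole-row mask: the last row is masked entirely.
full_rows = np.ma.MaskedArray(
    x, mask=[[False, False], [False, False], [True, True]]
)
# Scattered mask: every element equal to 2 is masked.
scattered = np.ma.masked_equal(x, 2)

flat = scattered.compressed()                    # 1-D, row structure is lost
rows_kept = np.ma.compress_rows(full_rows)       # 2-D, masked row removed
rows_scattered = np.ma.compress_rows(scattered)  # rows with any masked value removed

print(flat)
print(rows_kept)
print(rows_scattered)
```

With the scattered mask, only the last row survives `compress_rows`, because the first two rows each contain a masked element.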

np.ma makes the most sense when there's a scattering of masked values. It isn't of much value if you want to select, or deselect, whole rows or columns.
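As a rough illustration of the scattered case, a masked array keeps its shape while reductions skip the masked elements. This sketch assumes the value 2 stands for invalid data, which is an assumption made only for the example:

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])
mx = np.ma.masked_equal(x, 2)  # treat every 2 as invalid (illustrative choice)

col_means = mx.mean(axis=0)  # masked elements are left out of each column mean
n_valid = mx.count(axis=0)   # number of unmasked elements per column

print(col_means)
print(n_valid)
```

Each column has two valid entries left, so the column means are computed from those two values only.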

===============

Here are more typical masked arrays:

In [403]: np.ma.masked_inside(x,2,3)
Out[403]:
masked_array(data =
 [[1 --]
 [-- --]
 [-- 4]],
             mask =
 [[False  True]
 [ True  True]
 [ True False]],
       fill_value = 999999)

In [404]: np.ma.masked_equal(x,2)
Out[404]:
masked_array(data =
 [[1 --]
 [-- 3]
 [3 4]],
             mask =
 [[False  True]
 [ True False]
 [False False]],
       fill_value = 2)

In [406]: np.ma.masked_outside(x,2,3)
Out[406]:
masked_array(data =
 [[-- 2]
 [2 3]
 [3 --]],
             mask =
 [[ True False]
 [False False]
 [False  True]],
       fill_value = 999999)
answered Oct 06 '22 11:10 by hpaulj