I'm working with masked arrays thanks to some of the help I've gotten on Stack Overflow, but I'm running into a problem with how np.where evaluates a masked array.
My masked array is:
import numpy as np
m_pt0 = np.ma.masked_array([1, 2, 3, 0, 4, 7, 6, 5],
                           mask=[False, True, False, False,
                                 False, False, False, False])
And prints like this:
In [24]: print(m_pt0)
[1 -- 3 0 4 7 6 5]
I'm looking for the index in m_pt0 where m_pt0 == 0. I would expect that
np.where(0 == m_pt0)
would return:
(array([3]),)
However, despite the mask (or because of?), I instead get
(array([1, 3]),)
The entire point of using the mask is to avoid accessing indices that are "blank", so how can I use where (or another function) to retrieve only the indices that are unmasked and match my boolean criterion?
You need to use the masked variant of the where() function; otherwise it will return wrong or unwanted results for masked arrays. The same goes for other functions, like polyfit().
That is:
In [2]: np.ma.where(0 == m_pt0)
Out[2]: (array([3]),)
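As a self-contained check, here is a minimal sketch that rebuilds the array from the question and compares np.ma.where with doing the same thing by hand. The names idx_ma, valid and idx_manual are only illustrative and do not come from the question.

```python
import numpy as np

m_pt0 = np.ma.masked_array([1, 2, 3, 0, 4, 7, 6, 5],
                           mask=[False, True, False, False,
                                 False, False, False, False])

# Mask-aware lookup: masked entries are never reported.
idx_ma = np.ma.where(m_pt0 == 0)[0]

# The same idea spelled out by hand: test the raw data, then
# drop every position whose mask bit is set.
valid = ~np.ma.getmaskarray(m_pt0)
idx_manual = np.flatnonzero((m_pt0.data == 0) & valid)

print(idx_ma)      # [3]
print(idx_manual)  # [3]
```

The manual version can be handy when the comparison has to be done on the plain data for some other reason, since it never depends on how a given function treats the mask.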
The equality test may create confusion. The result is another masked array:
In [19]: 0 == m_pt0
Out[19]:
masked_array(data = [False -- False True False False False False],
mask = [False True False False False False False False],
fill_value = True)
A masked array has .data and .mask attributes. numpy functions that aren't MA aware just see the .data:
In [20]: _.data
Out[20]: array([False, True, False, True, False, False, False, False], dtype=bool)
np.where sees the two True values and returns
In [23]: np.where(0 == m_pt0)
Out[23]: (array([1, 3], dtype=int32),)
In [24]: np.where((0 == m_pt0).data)
Out[24]: (array([1, 3], dtype=int32),)
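One caution: whatever is stored in .data underneath a masked slot is best treated as an implementation detail. The output above is from the NumPy version used in this answer, and I would not assume every release stores the same value there. A small sketch that relies only on the mask of the comparison result (the names eq and hits are illustrative):

```python
import numpy as np

m_pt0 = np.ma.masked_array([1, 2, 3, 0, 4, 7, 6, 5],
                           mask=[False, True, False, False,
                                 False, False, False, False])

eq = (m_pt0 == 0)

# The comparison result is itself a masked array and carries
# the mask of its operand.
print(type(eq).__name__)
print(eq.mask.tolist())

# Whatever eq.data holds at the masked position, and-ing with the
# inverted mask discards it before the indices are taken.
hits = np.flatnonzero(eq.data & ~eq.mask)
print(hits)  # [3]
```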
Where possible it is better to use the np.ma version of a function:
In [25]: np.ma.where(0 == m_pt0)
Out[25]: (array([3], dtype=int32),)
Looking at the code with np.source(np.ma.where), I see it does
    if missing == 2:
        return filled(condition, 0).nonzero()
(plus lots of code for the 3-argument use)
That filled does:
In [27]: np.ma.filled((0 == m_pt0),0)
Out[27]: array([False, False, False, True, False, False, False, False], dtype=bool)
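Putting those two pieces together, the one-argument path can be sketched as below. where_single_arg is a hypothetical helper written for illustration, not the library's actual source, and it only covers the case where just the condition is passed.

```python
import numpy as np

def where_single_arg(condition):
    """Hypothetical stand-in for np.ma.where(condition)."""
    # Replace masked entries with 0 (i.e. False), then take the
    # indices of whatever is still nonzero.
    return np.ma.filled(condition, 0).nonzero()

m_pt0 = np.ma.masked_array([1, 2, 3, 0, 4, 7, 6, 5],
                           mask=[False, True, False, False,
                                 False, False, False, False])

result = where_single_arg(m_pt0 == 0)
reference = np.ma.where(m_pt0 == 0)

print(result[0])     # [3]
print(reference[0])  # [3]
```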
MA functions often replace the masked values with something innocuous (0 in this case), or use compressed to remove them from consideration.
In [36]: m_pt0.compressed()
Out[36]: array([1, 3, 0, 4, 7, 6, 5])
In [37]: m_pt0.filled(100)
Out[37]: array([ 1, 100, 3, 0, 4, 7, 6, 5])
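Note that compressed() drops the masked slots, so positions in the compressed array no longer line up with positions in the original. If you search the compressed values you have to map the hits back. A minimal sketch, with values, positions and original_idx as illustrative names:

```python
import numpy as np

m_pt0 = np.ma.masked_array([1, 2, 3, 0, 4, 7, 6, 5],
                           mask=[False, True, False, False,
                                 False, False, False, False])

# Unmasked values only, in their original order.
values = m_pt0.compressed()

# Original index of each surviving value.
positions = np.flatnonzero(~np.ma.getmaskarray(m_pt0))

# Search the compressed values, then translate back to
# indices into m_pt0.
original_idx = positions[values == 0]

print(values)        # [1 3 0 4 7 6 5]
print(positions)     # [0 2 3 4 5 6 7]
print(original_idx)  # [3]
```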
A numpy function will work correctly on a MA if it delegates the work to the array's own method.
In [41]: np.nonzero(m_pt0)
Out[41]: (array([0, 2, 4, 5, 6, 7], dtype=int32),)
In [42]: m_pt0.nonzero()
Out[42]: (array([0, 2, 4, 5, 6, 7], dtype=int32),)
In [43]: np.where(m_pt0)
Out[43]: (array([0, 1, 2, 4, 5, 6, 7], dtype=int32),)
np.nonzero delegates. np.where does not.
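The difference is easy to see by comparing the function, the method, and the same function applied to the plain .data array. This is only a sketch, and it assumes np.nonzero forwards to the object's own nonzero method as described above.

```python
import numpy as np

m_pt0 = np.ma.masked_array([1, 2, 3, 0, 4, 7, 6, 5],
                           mask=[False, True, False, False,
                                 False, False, False, False])

# The function hands the work to the masked array's own method,
# so the masked slot (index 1) is skipped along with the real zero.
via_function = np.nonzero(m_pt0)[0]
via_method = m_pt0.nonzero()[0]

# On the plain data there is no mask to honour, so index 1 shows up.
raw = np.nonzero(m_pt0.data)[0]

print(via_function.tolist())  # [0, 2, 4, 5, 6, 7]
print(via_method.tolist())    # [0, 2, 4, 5, 6, 7]
print(raw.tolist())           # [0, 1, 2, 4, 5, 6, 7]
```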
The repr of a masked array shows the mask. Its str just shows the masked data:
In [31]: m_pt0
Out[31]:
masked_array(data = [1 -- 3 0 4 7 6 5],
mask = [False True False False False False False False],
fill_value = 999999)
In [32]: str(m_pt0)
Out[32]: '[1 -- 3 0 4 7 6 5]'