I'm working with masked arrays thanks to some help I've gotten on Stack Overflow, but I'm running into a problem with how np.where evaluates a masked array.
My masked array is:
import numpy as np

m_pt0 = np.ma.masked_array([1, 2, 3, 0, 4, 7, 6, 5],
                           mask=[False, True, False, False,
                                 False, False, False, False])
And prints like this:
In [24]: print(m_pt0)
[1 -- 3 0 4 7 6 5]
I'm looking for the index in m_pt0 where m_pt0 == 0. I would expect that
np.where(0 == m_pt0)
would return:
(array([3]),)
However, despite the mask (or because of?), I instead get
(array([1, 3]),)
The entire point of using the mask is to avoid accessing indices that are "blank", so how can I use where (or another function) to retrieve only the indices that are unmasked and match my boolean criteria?
You need to use the masked variant of the where() function, np.ma.where(); otherwise it can return wrong or unwanted results for masked arrays. The same goes for other functions, like polyfit().
For example:
In [2]: np.ma.where(0 == m_pt0)
Out[2]: (array([3]),)
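The same point about polyfit() can be illustrated with a minimal sketch. The data below (x, y and the outlier value 100.0) are made up for illustration and are not from the question; the sketch only assumes that np.ma.polyfit leaves masked samples out of the fit.

```python
import numpy as np

# Hypothetical data: y = 2*x + 1, except for one bad sample at index 2.
x = np.arange(5, dtype=float)
y = np.ma.masked_array([1.0, 3.0, 100.0, 7.0, 9.0],
                       mask=[False, False, True, False, False])

# The masked variant drops the masked sample before fitting the line.
slope, intercept = np.ma.polyfit(x, y, 1)
print(slope, intercept)  # approximately 2.0 and 1.0
```

Because the remaining four points lie exactly on the line, the fit recovers its slope and intercept up to floating-point error.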
The equality test may create confusion. The result is another masked array:
In [19]: 0 == m_pt0
Out[19]:
masked_array(data = [False -- False True False False False False],
mask = [False True False False False False False False],
fill_value = True)
A masked array has .data and .mask attributes. numpy functions that aren't MA aware just see the .data:
In [20]: _.data
Out[20]: array([False, True, False, True, False, False, False, False], dtype=bool)
np.where sees the two True values and returns:
In [23]: np.where(0 == m_pt0)
Out[23]: (array([1, 3], dtype=int32),)
In [24]: np.where((0 == m_pt0).data)
Out[24]: (array([1, 3], dtype=int32),)
Where possible it is better to use the np.ma version of a function:
In [25]: np.ma.where(0 == m_pt0)
Out[25]: (array([3], dtype=int32),)
Looking at the code with np.source(np.ma.where), I see that the one-argument case does:
if missing == 2:
return filled(condition, 0).nonzero()
(plus lots of code for the three-argument use)
That filled call produces:
In [27]: np.ma.filled((0 == m_pt0),0)
Out[27]: array([False, False, False, True, False, False, False, False], dtype=bool)
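As a rough sketch, the same result can be reproduced by hand in two ways: by filling the masked comparison with False, or by combining the raw data comparison with the inverted mask. The variable names (cond, valid, idx_filled, idx_explicit) are only illustrative. Neither variant relies on how plain np.where treats a masked array, which may differ between NumPy versions.

```python
import numpy as np

m_pt0 = np.ma.masked_array([1, 2, 3, 0, 4, 7, 6, 5],
                           mask=[False, True, False, False,
                                 False, False, False, False])

# Same idea as one-argument np.ma.where: masked entries count as False.
cond = np.ma.filled(0 == m_pt0, False)
idx_filled = cond.nonzero()[0]

# Explicit variant: compare the raw data, then exclude masked positions.
valid = ~np.ma.getmaskarray(m_pt0)
idx_explicit = np.flatnonzero((m_pt0.data == 0) & valid)

print(idx_filled, idx_explicit)  # [3] [3]
```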
MA functions often replace the masked values with something innocuous (0 in this case), or use compressed to remove them from consideration.
In [36]: m_pt0.compressed()
Out[36]: array([1, 3, 0, 4, 7, 6, 5])
In [37]: m_pt0.filled(100)
Out[37]: array([ 1, 100, 3, 0, 4, 7, 6, 5])
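One caveat when searching for indices: compressed() removes the masked entries, so positions in the compressed array no longer line up with positions in the original. A small sketch of mapping them back, assuming a 1-D array (the names kept, pos_in_compressed and pos_in_original are hypothetical):

```python
import numpy as np

m_pt0 = np.ma.masked_array([1, 2, 3, 0, 4, 7, 6, 5],
                           mask=[False, True, False, False,
                                 False, False, False, False])

compressed = m_pt0.compressed()  # masked entry removed
pos_in_compressed = np.flatnonzero(compressed == 0)

# Original positions of the surviving (unmasked) elements.
kept = np.flatnonzero(~np.ma.getmaskarray(m_pt0))
pos_in_original = kept[pos_in_compressed]

print(pos_in_compressed, pos_in_original)  # [2] [3]
```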
A numpy function will work correctly on a masked array if it delegates the work to the array's own method:
In [41]: np.nonzero(m_pt0)
Out[41]: (array([0, 2, 4, 5, 6, 7], dtype=int32),)
In [42]: m_pt0.nonzero()
Out[42]: (array([0, 2, 4, 5, 6, 7], dtype=int32),)
In [43]: np.where(m_pt0)
Out[43]: (array([0, 1, 2, 4, 5, 6, 7], dtype=int32),)
np.nonzero delegates. np.where does not.
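If you would rather not depend on which numpy functions delegate, one option is to call the masked array's own method directly. A minimal sketch:

```python
import numpy as np

m_pt0 = np.ma.masked_array([1, 2, 3, 0, 4, 7, 6, 5],
                           mask=[False, True, False, False,
                                 False, False, False, False])

# The method treats masked entries as zero, so index 1 is skipped
# along with the genuine zero at index 3.
idx = m_pt0.nonzero()[0]
print(idx)  # [0 2 4 5 6 7]
```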
The repr of a masked array shows the mask. Its str just shows the masked data:
In [31]: m_pt0
Out[31]:
masked_array(data = [1 -- 3 0 4 7 6 5],
mask = [False True False False False False False False],
fill_value = 999999)
In [32]: str(m_pt0)
Out[32]: '[1 -- 3 0 4 7 6 5]'