 

Python numpy masked array initialization

I use masked arrays all the time in my work, but initializing them is a bit clunky. Specifically, ma.zeros() and ma.empty() return masked arrays whose mask is a single scalar False rather than a boolean array matching the array's shape. What I want is a full mask that starts out True everywhere, so that any element I never assign to stays masked by default.

In [4]: A=ma.zeros((3,))
...
masked_array(data = [ 0.  0.  0.],
             mask = False,
       fill_value = 1e+20)

I can subsequently assign the mask:

In [6]: A.mask=ones((3,))
...
masked_array(data = [-- -- --],
             mask = [ True  True  True],
       fill_value = 1e+20)

But why should I have to use two lines to initialize an array? Alternatively, I can ignore the ma.zeros() functionality and specify the mask and data in one line:

In [8]: A=ma.masked_array(zeros((3,)),mask=ones((3,)))
...
masked_array(data = [-- -- --],
             mask = [ True  True  True],
       fill_value = 1e+20)

But I think this is also clunky. I have trawled through the numpy.ma documentation but I can't find a neat way of dealing with this. Have I missed something obvious?

Thom Chubb asked Oct 06 '22 00:10



1 Answer

Well, the mask in ma.zeros is actually a special constant, ma.nomask, that corresponds to np.bool_(False). It's just a placeholder telling NumPy that the mask hasn't been set. Using nomask actually speeds up np.ma significantly: no need to keep track of where the masked values are if we know beforehand that there are none.
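
To see the placeholder in action, here is a minimal sketch (the variable names are just for illustration). It uses np.ma.getmask, which returns np.ma.nomask when no mask has been set, and np.ma.getmaskarray, which always returns a full boolean array:

```python
import numpy as np

a = np.ma.zeros((3,))

# No mask was given, so the array carries the shared placeholder constant.
mask_is_placeholder = np.ma.getmask(a) is np.ma.nomask

# getmaskarray expands the placeholder into a full boolean array of False.
full_mask = np.ma.getmaskarray(a)

print(mask_is_placeholder)
print(full_mask.shape)
```

So the scalar False in the repr is not a mask of the wrong shape, it just means "nothing is masked yet".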

The best approach is not to set your mask explicitly if you don't need it, and to let np.ma set it when needed (i.e., when you end up trying to take the log of a negative number).
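
As a small illustration of np.ma setting the mask on its own, the sketch below takes np.ma.log of an array containing a negative value. The input values are made up for the example:

```python
import numpy as np

x = np.ma.array([1.0, -1.0, 10.0])   # no mask specified

# np.ma.log masks entries outside the domain of log instead of returning nan.
y = np.ma.log(x)

print(y.mask)
```

Only the element where the log is undefined ends up masked; the others stay valid.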


Side note #1: to set the mask to an array of False with the same shape as your input, use

np.ma.array(..., mask=False)

That's easier to type. Note that it has to be the Python False, not np.ma.nomask. Similarly, use mask=True to force all your inputs to be masked (i.e., mask will be a bool ndarray full of True, with the same shape as the data).
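
A short sketch of the mask=True form applied to the situation in the question, where unassigned elements should stay masked. It assumes the default soft mask, under which assigning an ordinary value to an element unmasks it. np.ma.masked_all is another existing helper that builds a fully masked array in one call (its data are uninitialized, like np.empty):

```python
import numpy as np

# One-line initialization: every element starts out masked.
a = np.ma.array(np.zeros(3), mask=True)

# Assigning a regular value unmasks just that element (soft mask).
a[0] = 5.0
print(a.mask)

# Alternative helper: fully masked array, data left uninitialized.
b = np.ma.masked_all((3,))
print(b.mask)
```

Which form reads better is a matter of taste; masked_all is convenient when the initial data values don't matter.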


Side note #2: If you need to set the mask after initialization, don't assign to .mask; assign the special value np.ma.masked to the elements instead. It's safer:

a[:] = np.ma.masked
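
For completeness, a minimal sketch of that pattern starting from ma.zeros, again assuming the default soft mask:

```python
import numpy as np

a = np.ma.zeros((3,))     # mask starts as the nomask placeholder
a[:] = np.ma.masked       # creates a full mask and masks every element
a[1] = 2.0                # assigning a value unmasks that element

print(a.mask)
```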
Pierre GM answered Oct 10 '22 03:10
