I've read the masked array documentation several times now, searched everywhere, and feel thoroughly stupid. I can't figure out for the life of me how to apply a mask from one array to another.
Example:

    >>> import numpy as np
    >>> y = np.array([2,1,5,2])         # y axis
    >>> x = np.array([1,2,3,4])         # x axis
    >>> m = np.ma.masked_where(y>2, y)  # mask values larger than 2
    >>> print(m)
    [2 1 -- 2]
    >>> print(np.ma.compressed(m))
    [2 1 2]

So this works fine... but to plot this y axis, I need a matching x axis. How do I apply the mask from the y array to the x array? Something like this would make sense, but produces rubbish:

    >>> new_x = x[m.mask].copy()
    >>> new_x
    array([3])

So, how on earth is that done? (Note that the new x array needs to be a new array.)
Edit:
Well, it seems one way to do this works like this:

    >>> import numpy as np
    >>> x = np.array([1,2,3,4])
    >>> y = np.array([2,1,5,2])
    >>> m = np.ma.masked_where(y>2, y)
    >>> new_x = np.ma.masked_array(x, m.mask)
    >>> print(np.ma.compressed(new_x))
    [1 2 4]

But that's incredibly messy! I'm trying to find a solution as elegant as IDL...
To combine two masks with a logical OR, use np.ma.mask_or(m1, m2, copy=False, shrink=True). If copy is False (the default) and one of the inputs is nomask, the function returns a view of the other input mask. The shrink parameter controls whether the output is shrunk to nomask when all of its values are False.
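As a rough illustration of combining masks, here is a minimal sketch. The second array `z` and both thresholds are made up for the example; only `np.ma.mask_or` and `np.ma.nomask` come from NumPy.

```python
import numpy as np

y = np.array([2, 1, 5, 2])
z = np.array([0, 9, 1, 1])   # hypothetical second data set, not from the question
x = np.array([1, 2, 3, 4])

mask_y = y > 2               # True where y should be hidden
mask_z = z > 5               # True where z should be hidden

# Element-wise logical OR of the two masks.
combined = np.ma.mask_or(mask_y, mask_z)
# combined is [False, True, True, False]

# Keep only the x values that are valid in both data sets.
kept_x = x[~combined]
# kept_x is [1, 4]

# If one input is nomask, the other mask is passed through.
same = np.ma.mask_or(np.ma.nomask, mask_y)

print(combined)
print(kept_x)
```

This is handy when several conditions, each computed on a different array, should all hide the same positions.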
To create a boolean mask from an array, use np.ma.make_mask(). The function accepts any sequence that is convertible to integers, or nomask. The contents do not have to be 0s and 1s: values of 0 are interpreted as False, everything else as True.
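A small sketch of that behaviour, using made-up flag values. It also shows the default `shrink=True`, which collapses an all-False result to `nomask`.

```python
import numpy as np

flags = [0, 3, 0, -1]                 # hypothetical integer flags

# Zero becomes False, every other value becomes True.
mask = np.ma.make_mask(flags)
# mask is [False, True, False, True]

# With the default shrink=True, an all-False mask collapses to nomask.
collapsed = np.ma.make_mask([0, 0, 0])

# shrink=False keeps a full-size boolean array instead.
full = np.ma.make_mask([0, 0, 0], shrink=False)
# full is [False, False, False]

print(mask)
print(collapsed is np.ma.nomask)
print(full)
```

Passing `shrink=False` is the safer choice when the mask will later be used for boolean indexing, because the result always has the same shape as the input.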
A boolean array can be created manually by passing dtype=bool when creating the array. Values other than 0, None, False or empty strings are considered True. Alternatively, NumPy automatically creates a boolean array when comparisons are made between an array and a scalar or between arrays of the same shape.
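For example, all three routes can be sketched like this. The `limit` array is invented for the illustration.

```python
import numpy as np

# Explicit conversion: zero is False, non-zero is True.
explicit = np.array([0, 2, 0, 7], dtype=bool)
# explicit is [False, True, False, True]

y = np.array([2, 1, 5, 2])

# Comparison between an array and a scalar.
from_scalar = y > 2
# from_scalar is [False, False, True, False]

# Comparison between two arrays of the same shape.
limit = np.array([3, 0, 4, 2])        # hypothetical per-element limits
from_array = y > limit
# from_array is [False, True, True, False]

print(explicit, from_scalar, from_array)
```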
I had a similar issue, but involving many more masking commands and more arrays to apply them to. My solution is to do all the masking on one array and then use the mask of the final masked array as the condition in the masked_where command.
For example:

    import numpy as np

    y = np.array([2,1,5,2])                          # y axis
    x = np.array([1,2,3,4])                          # x axis
    m = np.ma.masked_where(y>5, y)                   # filter out values larger than 5
    new_x = np.ma.masked_where(np.ma.getmask(m), x)  # applies the mask of m on x

The nice thing is you can now apply this mask to many more arrays without going through the masking process for each of them.
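To show what reusing one mask across several arrays might look like, here is a minimal sketch. The helper `apply_mask` and the extra array `t` are hypothetical, and the threshold of 2 is taken from the question rather than from the snippet above.

```python
import numpy as np

y = np.array([2, 1, 5, 2])
x = np.array([1, 2, 3, 4])
t = np.array([10.0, 20.0, 30.0, 40.0])   # hypothetical extra array

# Do all the masking once, on y.
m = np.ma.masked_where(y > 2, y)

# getmaskarray always returns a full boolean array, even if nothing is masked.
shared = np.ma.getmaskarray(m)

def apply_mask(mask, *arrays):
    """Return a masked copy of each array using the same mask (hypothetical helper)."""
    # copy=True so the results do not share memory with the inputs.
    return [np.ma.masked_array(a, mask=mask, copy=True) for a in arrays]

mx, mt = apply_mask(shared, x, t)

print(mx.compressed())   # x values where y was not masked
print(mt.compressed())   # t values where y was not masked
```

With the data above, `mx.compressed()` holds 1, 2, 4 and `mt.compressed()` holds 10.0, 20.0, 40.0.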
Why not simply

    import numpy as np

    y = np.array([2,1,5,2])          # y axis
    x = np.array([1,2,3,4])          # x axis
    m = np.ma.masked_where(y>2, y)   # mask values larger than 2
    print(list(m))
    print(np.ma.compressed(m))

    # mask x the same way
    m_ = np.ma.masked_where(y>2, x)  # mask x wherever y is larger than 2
    print(list(m_))
    print(np.ma.compressed(m_))

The code was written for Python 2.x; the parenthesized print calls also run on Python 3.
Also, as proposed by joris, this does the job:

    new_x = x[~m.mask].copy()

giving an array

    >>> new_x
    array([1, 2, 4])
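One caveat worth sketching: when nothing is masked, a masked array's mask can be the scalar `nomask`, and `~m.mask` then no longer selects element by element. A possible guard, assuming a hypothetical helper named `select_unmasked`, is to go through `np.ma.getmaskarray`:

```python
import numpy as np

def select_unmasked(masked, *others):
    """Return new arrays holding only the positions not masked in `masked`.

    Hypothetical helper, not part of NumPy.
    """
    # getmaskarray returns a full boolean array even when the mask is nomask.
    keep = ~np.ma.getmaskarray(masked)
    # Boolean indexing already returns new arrays, not views.
    return [np.asarray(a)[keep] for a in others]

y = np.array([2, 1, 5, 2])
x = np.array([1, 2, 3, 4])

m = np.ma.masked_where(y > 2, y)
new_x, new_y = select_unmasked(m, x, y)
# new_x is [1, 2, 4], new_y is [2, 1, 2]

# Nothing masked: every element is kept.
unmasked = np.ma.masked_array(y)
(all_x,) = select_unmasked(unmasked, x)
# all_x is [1, 2, 3, 4]

print(new_x, new_y, all_x)
```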