I am trying to compute percentiles from an array that contains NoData values. In my case the NoData values are represented by -3.40282347e+38. I thought a masked array would exclude these values from further calculations. I successfully create the masked array, but the mask has no effect on the np.percentile() function.
>>> DataArray = np.array(data)
>>> DataArray
([[ value, value...]], dtype=float32)
>>> masked_data = ma.masked_where(DataArray < 0, DataArray)
>>> p5 = np.percentile(masked_data, 5)
>>> print(p5)
-3.40282347e+38
If you fill your masked values with np.nan, you can then use np.nanpercentile:
import numpy as np
data = np.arange(-5.5,10.5) # Note that you need a non-integer array to store NaN
mdata = np.ma.masked_where(data < 0, data)
mdata = np.ma.filled(mdata, np.nan)
np.nanpercentile(mdata, 50) # 50th percentile
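As a minimal sketch of the same idea applied to the NoData value from the question: the sample array below is made up for illustration, and masking by exact equality with the sentinel is an assumption about how the NoData cells are stored.

```python
import numpy as np

# Sentinel from the question, stored as float32 like the original data.
NODATA = np.float32(-3.40282347e+38)

# Hypothetical sample data; the real array would come from your raster.
data = np.array([[1.0, 2.0, NODATA],
                 [3.0, NODATA, 4.0]], dtype=np.float32)

# Mask only the cells equal to the sentinel (assumes an exact match).
masked = np.ma.masked_where(data == NODATA, data)

# Replace masked cells with NaN so nanpercentile can skip them.
filled = np.ma.filled(masked, np.nan)

p50 = np.nanpercentile(filled, 50)  # median of the valid cells 1, 2, 3, 4
print(p50)
```

Masking on the sentinel itself, rather than on data < 0, keeps any legitimate negative values in the calculation.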
Looking at the np.percentile code, it is clear that it does nothing special with masked arrays.
def percentile(a, q, axis=None, out=None,
               overwrite_input=False, interpolation='linear', keepdims=False):
    q = array(q, dtype=np.float64, copy=True)
    r, k = _ureduce(a, func=_percentile, q=q, axis=axis, out=out,
                    overwrite_input=overwrite_input,
                    interpolation=interpolation)
    if keepdims:
        if q.ndim == 0:
            return r.reshape(k)
        else:
            return r.reshape([len(q)] + k)
    else:
        return r
Here _ureduce and _percentile are internal functions defined in numpy/lib/function_base.py, so the real action is more complex.
Masked arrays have 2 strategies for using numpy functions. One is to fill - replace the masked values with innocuous ones, for example 0 when doing a sum, 1 when doing a product. The other is to compress the data - that is, remove all masked values.
For example:
In [997]: data=np.arange(-5,10)
In [998]: mdata=np.ma.masked_where(data<0,data)
In [1001]: np.ma.filled(mdata,0)
Out[1001]: array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [1002]: np.ma.filled(mdata,1)
Out[1002]: array([1, 1, 1, 1, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [1008]: mdata.compressed()
Out[1008]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Which is going to give you the desired percentile? Filling, compressing, or neither? You need to understand the concept of a percentile well enough to know how it should apply in the case of your masked values.
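One way to see the difference is to compute the same percentile under each strategy. This is only a sketch using the small example array from above; which result is appropriate depends on what the masked values mean in your data.

```python
import numpy as np

data = np.arange(-5, 10)
mdata = np.ma.masked_where(data < 0, data)

# Strategy 1: fill masked values with 0; the filler values still count as data.
p50_filled = np.percentile(np.ma.filled(mdata, 0), 50)

# Strategy 2: compress; masked values are dropped before the calculation.
p50_compressed = np.percentile(mdata.compressed(), 50)

# Filling with NaN (on a float copy) and using nanpercentile also skips them.
p50_nan = np.nanpercentile(np.ma.filled(mdata.astype(float), np.nan), 50)

print(p50_filled, p50_compressed, p50_nan)
```

Here the zero-filled version gives 2.0, because the five filler zeros pull the median down, while compressing and the NaN approach both give 4.5. If masked cells are meant to be ignored, as with NoData, compressing or nanpercentile is likely the behaviour you want.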