I'm a little confused about the output of numpy.median in the case of masked arrays. Here is a simple example (assuming numpy is imported - I have version 1.6.2):
>>> a = [3.0, 4.0, 5.0, 6.0, numpy.nan]
>>> am = numpy.ma.masked_array(a, [numpy.isnan(x) for x in a])
I'd like to be able to use the masked array to ignore nan values in the array when calculating the median. This works for mean using either numpy.mean or the mean() method of the masked array:
>>> numpy.mean(a)
nan
>>> numpy.mean(am)
4.5
>>> am.mean()
4.5
However for median I get:
>>> numpy.median(am)
5.0
but I'd expect something more like this result:
>>> numpy.median([x for x in a if not numpy.isnan(x)])
4.5
and unfortunately a masked_array does not have a median method.
Get the median of an array along an axis: use axis=0 to get the median of each column, and axis=1 to get the median of each row. For example:
# Use numpy median() along axis=0
# Get the median value of each column
arr1 = np.median(arr, axis=0)
print(arr1)
# Output
# [ 8. ...
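A small sketch of the axis behaviour may help here. The 3x3 array below is made up for illustration (the original `arr` is not shown), so the numbers are not the ones from the truncated output above.

```python
import numpy as np

# Hypothetical data, chosen only to make the medians easy to check by hand.
arr = np.array([[1.0, 5.0, 9.0],
                [3.0, 7.0, 2.0],
                [8.0, 6.0, 4.0]])

# axis=0 collapses the rows, giving one median per column.
col_medians = np.median(arr, axis=0)
print(col_medians)  # [3. 6. 4.]

# axis=1 collapses the columns, giving one median per row.
row_medians = np.median(arr, axis=1)
print(row_medians)  # [5. 3. 6.]
```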
A masked array is the combination of a standard numpy.ndarray and a mask. A mask is either nomask, indicating that no value of the associated array is invalid, or an array of booleans that determines for each element of the associated array whether the value is valid or not.
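As a rough illustration of the two kinds of mask, here is a sketch using the array from the question. Building the mask with `np.isnan(a)` on a NumPy array (rather than the list comprehension in the question) and the variable names are my own choices.

```python
import numpy as np

a = np.array([3.0, 4.0, 5.0, 6.0, np.nan])

# Boolean mask: True marks an element as invalid (masked).
am = np.ma.masked_array(a, mask=np.isnan(a))
mask_as_list = am.mask.tolist()
print(mask_as_list)   # [False, False, False, False, True]

# count() reports how many elements are NOT masked.
valid_count = am.count()
print(valid_count)    # 4

# With no mask given, the array carries the special value nomask.
unmasked = np.ma.masked_array([1.0, 2.0, 3.0])
has_nomask = np.ma.getmask(unmasked) is np.ma.nomask
print(has_nomask)     # True
```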
According to the documentation of numpy.median, you don't have to manually sort the data before feeding it to the function, as it does this internally. It is actually very good practice to view the source code of the function, and try to understand how it works.
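A quick check of that claim, using made-up unsorted data: the result should be the same whether or not the input is sorted first.

```python
import numpy as np

# Hypothetical unsorted input.
data = [6.0, 3.0, 5.0, 4.0]

median_unsorted = np.median(data)
median_sorted = np.median(sorted(data))

print(median_unsorted)  # 4.5
print(median_sorted)    # 4.5
```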
Use np.ma.median on a MaskedArray.
[Explanation: If I remember correctly, np.median does not support subclasses, so it fails to work correctly on np.ma.MaskedArray.]
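Applied to the array from the question, this might look like the sketch below. The second approach, taking the plain median of `am.compressed()` (the unmasked values as an ordinary array), is an alternative I'm adding for comparison; it is not part of the original answer.

```python
import numpy as np

a = [3.0, 4.0, 5.0, 6.0, np.nan]
am = np.ma.masked_array(a, mask=np.isnan(a))

# np.ma.median respects the mask, so the nan is ignored.
masked_median = np.ma.median(am)
print(masked_median)      # 4.5

# Alternative: drop the masked entries, then use the ordinary median.
compressed_median = np.median(am.compressed())
print(compressed_median)  # 4.5
```

Both give the 4.5 that the question expected, matching the result of filtering out the nan by hand.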