How to calculate mean value of an array (A) avoiding nan?
    import numpy as np
    A = np.array([5, np.nan, np.nan, np.nan, np.nan, 10])
    M = np.mean(A[A != np.nan])

This does not work. Any idea?
NumPy's nanmean() function computes the arithmetic mean (average) of an array while ignoring NaN values, so the result is not influenced by missing entries. By default the mean is taken over the flattened array; otherwise it is taken over the specified axis. For a 2-D array, axis=0 gives one mean per column and axis=1 gives one mean per row.
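As a small illustration of the axis behaviour described above, here is a minimal sketch. The 2x3 array and the variable names are made up for the example and do not come from the question.

```python
import numpy as np

# Hypothetical 2x3 array with two missing entries (illustration only).
A = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, np.nan]])

overall = np.nanmean(A)            # flattened: (1 + 3 + 4 + 5) / 4
col_means = np.nanmean(A, axis=0)  # one mean per column
row_means = np.nanmean(A, axis=1)  # one mean per row

print(overall)    # 3.25
print(col_means)  # means of the columns: 2.5, 5.0, 3.0
print(row_means)  # means of the rows: 2.0, 4.5
```

Each mean divides by the number of non-NaN entries in its slice, which is why the middle column, with a single valid value, simply returns that value.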
NaN is returned for slices that contain only NaNs. The arithmetic mean is the sum of the non-NaN elements along the axis divided by the number of non-NaN elements. Note that for floating-point input, the mean is computed using the same precision as the input.
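To see the all-NaN case in practice, the following sketch uses a made-up array whose first row has no valid values. NumPy emits a RuntimeWarning for such a slice, so the example silences it; the array and names are assumptions for illustration.

```python
import warnings
import numpy as np

# Hypothetical array: the first row contains only NaNs.
B = np.array([[np.nan, np.nan],
              [1.0, np.nan]])

with warnings.catch_warnings():
    # An all-NaN slice triggers a RuntimeWarning; ignore it for this demo.
    warnings.simplefilter("ignore", category=RuntimeWarning)
    row_means = np.nanmean(B, axis=1)

print(row_means)  # first entry is nan, second entry is 1.0
```

If an all-NaN slice should be treated as an error rather than returning NaN, the warning filter could be set to "error" instead of "ignore".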
NaN stands for Not a Number and is one of the common ways to represent a missing value in data. It is a special floating-point value and cannot be converted to an integer; an array that holds NaN is therefore stored with a float dtype.
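A quick way to check that NaN is a float is to inspect the dtype NumPy picks. This is only a sketch with made-up arrays:

```python
import numpy as np

ints = np.array([1, 2, 3])           # integer dtype (exact width is platform dependent)
with_nan = np.array([1, 2, np.nan])  # the NaN forces a floating-point dtype

print(ints.dtype)
print(with_nan.dtype)  # float64
print(type(np.nan))    # <class 'float'>
```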
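Another possibility is the following:

    import numpy
    # nanmedian exists too, if you need it.
    # Note: scipy.stats.nanmean is only available in older SciPy releases;
    # later releases removed it in favour of numpy.nanmean.
    from scipy.stats import nanmean

    A = numpy.array([5, numpy.nan, numpy.nan, numpy.nan, numpy.nan, 10])
    print(nanmean(A))  # gives 7.5 as expected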
I guess this looks more elegant (and readable) than the other solution already given.
Edit: as @Jaime reports, this functionality already exists directly in the latest numpy (1.8) as well, so there is no need to import scipy.stats anymore if you have that version of numpy:
    import numpy

    A = numpy.array([5, numpy.nan, numpy.nan, numpy.nan, numpy.nan, 10])
    print(numpy.nanmean(A))
The first solution also works for people who don't have the latest version of numpy (like me).
Use numpy.isnan:

    >>> import numpy as np
    >>> A = np.array([5, np.nan, np.nan, np.nan, np.nan, 10])
    >>> np.isnan(A)
    array([False,  True,  True,  True,  True, False], dtype=bool)
    >>> ~np.isnan(A)
    array([ True, False, False, False, False,  True], dtype=bool)
    >>> A[~np.isnan(A)]
    array([  5.,  10.])
    >>> A[~np.isnan(A)].mean()
    7.5
This is needed because you cannot compare nan with nan:

    >>> np.nan == np.nan
    False
    >>> np.nan != np.nan
    True
    >>> np.isnan(np.nan)
    True
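Putting this together, the sketch below contrasts the mask from the question with the isnan-based mask on the same array. The variable names are chosen here for illustration.

```python
import numpy as np

A = np.array([5, np.nan, np.nan, np.nan, np.nan, 10])

# Every element, including each NaN, compares "not equal" to nan,
# so this mask is all True and filters nothing out.
wrong_mask = A != np.nan
# np.isnan marks the NaN entries; ~ inverts the mask to keep valid values.
right_mask = ~np.isnan(A)

wrong_mean = np.mean(A[wrong_mask])
right_mean = np.mean(A[right_mask])

print(wrong_mask.all())  # True
print(wrong_mean)        # nan
print(right_mean)        # 7.5
```

This is why the original attempt still returns nan: the comparison keeps the NaN entries in the selection, and a plain mean over them propagates NaN.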