pandas rolling apply to allow nan

I have a very simple Pandas Series:

xx = pd.Series([1, 2, np.nan, np.nan, 3, 4, 5])

If I run this I get what I want:

>>> xx.rolling(3,1).mean()
0    1.0
1    1.5
2    1.5
3    2.0
4    3.0
5    3.5
6    4.0

But if I have to use .apply(), I cannot get the mean to ignore NaNs:

>>> xx.rolling(3,1).apply(np.mean)
0    1.0
1    1.5
2    NaN
3    NaN
4    NaN
5    NaN
6    4.0

>>> xx.rolling(3,1).apply(lambda x : np.mean(x))
0    1.0
1    1.5
2    NaN
3    NaN
4    NaN
5    NaN
6    4.0

What should I do to use .apply() and still get the result shown in the first output? My actual problem is more complicated and requires .apply(), but it boils down to this issue.

Zhang18 asked Jun 06 '17 23:06


1 Answer

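You can use np.nanmean(), which ignores NaNs. np.mean() returns NaN as soon as the window contains one, because .apply() hands the whole window, NaNs included, to your function: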

xx.rolling(3,1).apply(lambda x : np.nanmean(x))
Out[59]: 
0    1.0
1    1.5
2    1.5
3    2.0
4    3.0
5    3.5
6    4.0
dtype: float64

If you have to handle the NaNs explicitly, you can filter them out before taking the mean:

xx.rolling(3,1).apply(lambda x : np.mean(x[~np.isnan(x)]))
Out[94]: 
0    1.0
1    1.5
2    1.5
3    2.0
4    3.0
5    3.5
6    4.0
dtype: float64
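
If your real function is more involved than a mean, the same filtering idea can be wrapped once and reused. Below is a minimal sketch, assuming a pandas version where rolling(...).apply() accepts raw=True; nan_safe is a made-up helper name for illustration, not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd


def nan_safe(func):
    """Hypothetical helper: wrap func so it only sees the non-NaN values of a window."""
    def wrapper(window):
        # Drop NaNs before handing the window to the wrapped function.
        values = window[~np.isnan(window)]
        # Return NaN if nothing is left after dropping NaNs.
        return func(values) if values.size else np.nan
    return wrapper


xx = pd.Series([1, 2, np.nan, np.nan, 3, 4, 5])

# raw=True passes each window to the function as a NumPy array.
result = xx.rolling(3, min_periods=1).apply(nan_safe(np.mean), raw=True)

# Built-in rolling mean, for comparison with the question's first output.
expected = xx.rolling(3, min_periods=1).mean()

print(result)
```

Any reducer that takes a 1-D array and returns a scalar (np.median, np.std, or your own function) can be passed to nan_safe in the same way.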

Allen answered Oct 08 '22 06:10