
Pandas rolling apply with missing data

I want to do a rolling computation on data that contains missing values.

Sample code (for the sake of simplicity I'm using a rolling sum, but I want to do something more generic):

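import numpy as np
import pandas

# Sum only the non-null values in each window
foo = lambda z: z[pandas.notnull(z)].sum()
x = np.arange(10, dtype="float")
x[6] = np.nan  # introduce a missing value
x2 = pandas.Series(x)
pandas.rolling_apply(x2, 3, foo)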

which produces:

0   NaN    
1   NaN
2     3    
3     6    
4     9    
5    12    
6   NaN    
7   NaN    
8   NaN    
9    24

I think that during the rolling, any window that contains missing data is skipped entirely instead of being passed to my function. I'm looking for a result along the lines of:

0   NaN    
1   NaN    
2     3    
3     6    
4     9    
5    12    
6     9    
7    12    
8    15    
9    24
Mahesh asked Nov 15 '12 20:11


1 Answer

In [7]: pandas.rolling_apply(x2, 3, foo, min_periods=2)
Out[7]: 
0   NaN
1     1
2     3
3     6
4     9
5    12
6     9
7    12
8    15
9    24
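Why this works: min_periods is the minimum number of non-missing observations a window must contain before the function is called. When it is left unset it defaults to the window size (3 here), so any window touching the NaN yields NaN without foo ever running. The top-level pandas.rolling_apply function used above is no longer present in recent pandas releases. A minimal sketch of the same idea using the Series.rolling(...).apply(...) method follows; the helper name nan_aware_sum is only illustrative, and raw=True is an assumption made here so the function receives a plain NumPy array.

```python
import numpy as np
import pandas as pd

# Same data as in the question: 0..9 with a missing value at position 6
x = np.arange(10, dtype="float")
x[6] = np.nan
x2 = pd.Series(x)

def nan_aware_sum(window):
    """Sum the non-missing values of one window (illustrative helper)."""
    return window[~np.isnan(window)].sum()

# min_periods=2 lets windows with a single NaN still reach the function;
# raw=True passes each window as a NumPy array rather than a Series
result = x2.rolling(window=3, min_periods=2).apply(nan_aware_sum, raw=True)
print(result)
```

Any other NaN-aware reduction can be substituted for nan_aware_sum, which covers the "something more generic" case as long as the function itself tolerates missing values.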
Chang She answered Nov 06 '22 01:11