
Apply a function on a rolling window in a DataFrame where the whole DataFrame is passed to the function

I have a dataframe of 5 columns indexed by YearMo:

import numpy as np
import pandas as pd

# 120 monthly labels: 200001 ... 200912
yearmo = np.repeat(np.arange(2000, 2010) * 100, 12) + list(range(1, 13)) * 10
rates = pd.DataFrame(data=np.random.random((120, 5)),
                     index=pd.Series(data=yearmo, name='YearMo'),
                     columns=['A', 'B', 'C', 'D', 'E'])

rates.head()                       
YearMo    A         B          C         D       E 
200411  0.237696  0.341937  0.258713  0.569689  0.470776
200412  0.601713  0.313006  0.221821  0.720162  0.889891
200501  0.024379  0.761315  0.225032  0.293682  0.302431
200502  0.996778  0.388783  0.026448  0.056188  0.744850
200503  0.942024  0.768416  0.484236  0.102904  0.287446

What I would like to do is to be able to apply a rolling window and pass all five columns to a function – something like:

rates.rolling(window=60, min_periods=60).apply(lambda x: my_func(data=x, param=5))

but this approach applies the function to each column separately. Specifying axis=1 doesn't help either.

naveendaftari asked Apr 15 '17 00:04



1 Answer

Question: ... apply a rolling window and pass all five columns to a function

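This will do what you want: roll along the columns with axis=1 and set min_periods=5. With axis=1 the window moves across columns 'A':'E', so the window size has to be 5 or a multiple of 5 (60 here), and min_periods=5 means the function is only called once all five values of a row are in the window. Note that the function then receives the five values of one row at a time.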

def f1(data=None, param=None):
    # data arrives as a numpy array holding the five column values of one row
    print('f1(%s, %s) data=%s' % (str(type(data)), param, data))
    return data.sum()

subRates = rates.rolling(window=60, min_periods=5, axis=1).apply(lambda x: f1(x))

Input:

               A         B         C         D         E
YearMo
200001  0.666744  0.569194  0.546873  0.018696  0.240783
200002  0.035888  0.853077  0.348200  0.921997  0.283177
200003  0.652761  0.076630  0.298076  0.800504  0.041231
200004  0.537397  0.968399  0.211072  0.328157  0.929783
200005  0.759506  0.702220  0.807477  0.886935  0.022587

Output:

f1(<class 'numpy.ndarray'>, None) data=[ 0.66674393  0.56919434  0.54687296  0.01869609  0.24078329]
f1(<class 'numpy.ndarray'>, None) data=[ 0.03588751  0.85307707  0.34819965  0.92199698  0.28317727]
f1(<class 'numpy.ndarray'>, None) data=[ 0.65276067  0.07663029  0.29807589  0.80050448  0.04123137]
f1(<class 'numpy.ndarray'>, None) data=[ 0.53739687  0.96839917  0.21107155  0.32815687  0.92978308]
f1(<class 'numpy.ndarray'>, None) data=[ 0.75950632  0.70222034  0.80747698  0.88693524  0.02258685]
         A   B   C   D         E
YearMo
200001 NaN NaN NaN NaN  2.042291
200002 NaN NaN NaN NaN  2.442338
200003 NaN NaN NaN NaN  1.869203
200004 NaN NaN NaN NaN  2.974808
200005 NaN NaN NaN NaN  3.178726
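
On newer pandas the axis argument of rolling may not be available (as far as I know it was deprecated in pandas 2.1), so here is a minimal sketch of the same row-wise result without it. It assumes that all you need is the five values of each row handed to the function; it uses DataFrame.apply with axis=1 and raw=True, which is a swapped-in technique and not the rolling call shown above. The data setup just mirrors the question.

```python
import numpy as np
import pandas as pd

def f1(data=None, param=None):
    # With raw=True below, data is a 1-D numpy array holding one row's values
    return data.sum()

# Same shape of data as in the question: 120 monthly rows, columns A..E
yearmo = np.repeat(np.arange(2000, 2010) * 100, 12) + np.tile(np.arange(1, 13), 10)
rates = pd.DataFrame(np.random.random((120, 5)),
                     index=pd.Index(yearmo, name='YearMo'),
                     columns=['A', 'B', 'C', 'D', 'E'])

# One call of f1 per row; the result is a Series indexed by YearMo
row_sums = rates.apply(f1, axis=1, raw=True)
```

The result is a single Series rather than a frame with NaN in columns A to D, which is usually easier to work with.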

Tested with Python 3.4.2 and pandas 0.19.2.
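
If the goal is instead to hand a block of 60 consecutive rows with all five columns to the function, a plain loop over row slices is one way to do it. This is only a sketch under assumptions: rolling_apply_frame is a hypothetical helper name, and my_func here is a placeholder body, since the question does not show what the real function computes.

```python
import numpy as np
import pandas as pd

def my_func(data, param):
    # Placeholder for the asker's function (assumption): data is a DataFrame
    # slice with `window` rows and all five columns
    return data.to_numpy().mean() * param

def rolling_apply_frame(df, window, func, **kwargs):
    # Hypothetical helper: call func once per full window of rows and store
    # the result at the window's last row; earlier rows stay NaN
    out = np.full(len(df), np.nan)
    for end in range(window, len(df) + 1):
        out[end - 1] = func(data=df.iloc[end - window:end], **kwargs)
    return pd.Series(out, index=df.index)

# Same shape of data as in the question: 120 monthly rows, columns A..E
yearmo = np.repeat(np.arange(2000, 2010) * 100, 12) + np.tile(np.arange(1, 13), 10)
rates = pd.DataFrame(np.random.random((120, 5)),
                     index=pd.Index(yearmo, name='YearMo'),
                     columns=['A', 'B', 'C', 'D', 'E'])

result = rolling_apply_frame(rates, window=60, func=my_func, param=5)
```

The first 59 entries of result are NaN, which matches what min_periods=60 would give. The loop is not fast, but for 120 rows that hardly matters.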

stovfl answered Sep 27 '22 23:09
