I am trying to apply a rolling function, with a 3-year window, to a pandas DataFrame.
import numpy as np
import pandas as pd

# Dummy data
df = pd.DataFrame({'Product': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'Year': [2015, 2016, 2017, 2018, 2015, 2016, 2017, 2018],
                   'IB': [2, 5, 8, 10, 7, 5, 10, 14],
                   'OB': [5, 8, 10, 12, 5, 10, 14, 20],
                   'Delta': [2, 2, 1, 3, -1, 3, 2, 4]})

# The function to be applied (expects positionally indexable sequences)
def get_ln_rate(ib, ob, delta):
    n_years = len(ib)
    return sum(delta) * np.log(ob[-1] / ib[0]) / (n_years * (ob[-1] - ib[0]))
The expected output is:

  Product  Year  IB  OB  Delta  Ln_Rate
0       A  2015   2   5      2
1       A  2016   5   8      2
2       A  2017   8  10      1   0.3353
3       A  2018  10  12      3   0.2501
4       B  2015   7   5     -1
5       B  2016   5  10      3
6       B  2017  10  14      2   0.1320
7       B  2018  14  20      4   0.2773
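To make the target numbers concrete, here is a small sketch that evaluates the formula by hand for the first complete window of product A (years 2015 to 2017). The variable names are only illustrative; the values are copied from the dummy data.

```python
import numpy as np

# First full 3-year window for product A (2015-2017), taken from the dummy data
ib = [2, 5, 8]
ob = [5, 8, 10]
delta = [2, 2, 1]

n_years = len(ib)
# sum(delta) * ln(last OB / first IB) / (n_years * (last OB - first IB))
rate = sum(delta) * np.log(ob[-1] / ib[0]) / (n_years * (ob[-1] - ib[0]))
print(round(float(rate), 4))  # 0.3353
```

This is 5 * ln(5) / (3 * 8), which matches the Ln_Rate shown for row 2.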
I have tried
df['Ln_Rate'] = df.groupby('Product').rolling(3).apply(lambda x: get_ln_rate(x['IB'], x['OB'], x['Delta']))
But this does not work.
I have found several similar posts:
Applying custom rolling function to dataframe - this one does not have a clear answer.
Pandas Rolling Apply custom - this one does not use multiple arguments.
Apply custom function on pandas dataframe on a rolling window - this one uses rolling.apply, but it does not show the syntax.
None of them seems to be spot on. Any pointers towards the correct syntax would be greatly appreciated.
Another answer came to mind: create rolling windows over each group's index and pass the corresponding partial DataFrames to your custom function. The function is not strictly called with multiple arguments, but it still receives all the data it needs.
import numpy as np
import pandas as pd

df = pd.DataFrame({'Product': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'Year': [2015, 2016, 2017, 2018, 2015, 2016, 2017, 2018],
                   'IB': [2, 5, 8, 10, 7, 5, 10, 14],
                   'OB': [5, 8, 10, 12, 5, 10, 14, 20],
                   'Delta': [2, 2, 1, 3, -1, 3, 2, 4]})

# The function to be applied
def get_ln_rate(df):
    n_years = len(df['IB'])
    return df['Delta'].sum() * np.log(df['OB'].iloc[-1] / df['IB'].iloc[0]) / (n_years * (df['OB'].iloc[-1] - df['IB'].iloc[0]))

# Roll over each group's index labels and look up the matching rows per window
ln_rate = (
    df.groupby('Product')
      .apply(lambda grp: pd.Series(grp.index)
                           .rolling(3)
                           .agg({'Ln_Rate': lambda window: get_ln_rate(grp.loc[window])}))
      .reset_index()['Ln_Rate']
)

df.assign(Ln_Rate=ln_rate)
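In case the dictionary form of agg on a rolling Series is not accepted by your pandas version, the same idea can be sketched with an explicit loop over each group's windows. This is only one possible variant, not the only way to do it: rolling_ln_rate and its window_size parameter are hypothetical names introduced here, and the sketch assumes the rows are already sorted by Year within each Product and that the last OB never equals the first IB in a window (which would divide by zero).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Product': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'Year': [2015, 2016, 2017, 2018, 2015, 2016, 2017, 2018],
                   'IB': [2, 5, 8, 10, 7, 5, 10, 14],
                   'OB': [5, 8, 10, 12, 5, 10, 14, 20],
                   'Delta': [2, 2, 1, 3, -1, 3, 2, 4]})

def get_ln_rate(window):
    """Log rate for one window; `window` is a DataFrame slice with IB, OB and Delta."""
    ib_first = window['IB'].iloc[0]
    ob_last = window['OB'].iloc[-1]
    n_years = len(window)
    return window['Delta'].sum() * np.log(ob_last / ib_first) / (n_years * (ob_last - ib_first))

def rolling_ln_rate(group, window_size=3):
    """Hypothetical helper: apply get_ln_rate to every full window of one group."""
    values = [np.nan] * len(group)  # incomplete windows stay NaN
    for end in range(window_size, len(group) + 1):
        values[end - 1] = get_ln_rate(group.iloc[end - window_size:end])
    # Keep the group's original index so the result aligns with df
    return pd.Series(values, index=group.index)

parts = [rolling_ln_rate(grp) for _, grp in df.groupby('Product')]
result = df.assign(Ln_Rate=pd.concat(parts))
print(result.round(4))
```

Because each partial result keeps the original row index, assign lines the values up with the right rows without any reset_index step. The loop is slower than a vectorised solution on large frames, but it passes all columns of the window to the function at once.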