
Apply rolling function on pandas dataframe with multiple arguments

I am trying to apply a rolling function, with a 3 year window, on a pandas dataframe.

import numpy as np
import pandas as pd

# Dummy data
df = pd.DataFrame({'Product': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'Year': [2015, 2016, 2017, 2018, 2015, 2016, 2017, 2018],
                   'IB': [2, 5, 8, 10, 7, 5, 10, 14],
                   'OB': [5, 8, 10, 12, 5, 10, 14, 20],
                   'Delta': [2, 2, 1, 3, -1, 3, 2, 4]})

# The function to be applied
def get_ln_rate(ib, ob, delta):
    n_years = len(ib)
    return sum(delta)*np.log(ob[-1]/ib[0]) / (n_years * (ob[-1] - ib[0]))

The expected output is

  Product  Year  IB  OB  Delta  Ln_Rate
0       A  2015   2   5      2     
1       A  2016   5   8      2    
2       A  2017   8  10      1   0.3353
3       A  2018  10  12      3   0.2501
4       B  2015   7   5     -1  
5       B  2016   5  10      3
6       B  2017  10  14      2   0.1320
7       B  2018  14  20      4   0.2773
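As a sanity check on the expected values, the first non-empty entry (product A, window 2015 to 2017) can be worked out by hand. This is only a sketch using the numbers from the dummy data above; the variable names are illustrative.

```python
import math

# Product A, window 2015-2017, values copied from the dummy data
ib_first = 2            # IB in the first year of the window (2015)
ob_last = 10            # OB in the last year of the window (2017)
delta_sum = 2 + 2 + 1   # Delta summed over the three years
n_years = 3             # window length

ln_rate = delta_sum * math.log(ob_last / ib_first) / (n_years * (ob_last - ib_first))
print(round(ln_rate, 4))  # 0.3353
```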

I have tried

df['Ln_Rate'] = df.groupby('Product').rolling(3).apply(lambda x: get_ln_rate(x['IB'], x['OB'], x['Delta']))

But this does not work.

I have found several similar posts

applying custom rolling function to dataframe - this one does not have a clear answer

Pandas Rolling Apply custom - this one does not have multiple arguments

apply custom function on pandas dataframe on a rolling window - this one has rolling.apply... but it doesn't show the syntax.

None of them seems to be spot on. Any pointers towards the correct syntax would be greatly appreciated.

asked Nov 23 '25 02:11 by mortysporty


1 Answer

Another approach came to mind: create rolling windows over each group's index and pass the corresponding partial DataFrames to your custom function. The function is not literally called with multiple arguments, but it still receives all the data it needs.

import numpy as np
import pandas as pd

df = pd.DataFrame({'Product': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'Year': [2015, 2016, 2017, 2018, 2015, 2016, 2017, 2018],
                   'IB': [2, 5, 8, 10, 7, 5, 10, 14],
                   'OB': [5, 8, 10, 12, 5, 10, 14, 20],
                   'Delta': [2, 2, 1, 3, -1, 3, 2, 4]})

# The function to be applied
def get_ln_rate(df):
    n_years = len(df['IB'])
    return df['Delta'].sum() * np.log(df['OB'].iloc[-1] / df['IB'].iloc[0]) / (n_years * (df['OB'].iloc[-1] - df['IB'].iloc[0]))

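# For each product, roll a 3-row window over the group's index labels and
# evaluate get_ln_rate on the rows of the group selected by each window.
ln_rate = (
    df.groupby('Product')
      .apply(lambda grp: pd.Series(grp.index).rolling(3).agg(
          {'Ln_Rate': lambda window: get_ln_rate(grp.loc[window])}))
      .reset_index()['Ln_Rate']
)
# assign returns a new DataFrame; bind the result to a name to keep the column
df.assign(Ln_Rate=ln_rate)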
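The attempt in the question fails because `rolling(...).apply` hands the function one column's window at a time, so `x['IB']` is not available inside the lambda. If the index-window trick above is awkward in your pandas version, a plainer alternative is to slice the windows explicitly. The following is a minimal sketch, not the approach from this answer: the helper `rolling_ln_rate` and its `window` parameter are names made up for illustration, and it assumes the rows of each product are already sorted by year.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Product': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'Year': [2015, 2016, 2017, 2018, 2015, 2016, 2017, 2018],
                   'IB': [2, 5, 8, 10, 7, 5, 10, 14],
                   'OB': [5, 8, 10, 12, 5, 10, 14, 20],
                   'Delta': [2, 2, 1, 3, -1, 3, 2, 4]})

# Same formula as in the question, taking three NumPy arrays
def get_ln_rate(ib, ob, delta):
    n_years = len(ib)
    return delta.sum() * np.log(ob[-1] / ib[0]) / (n_years * (ob[-1] - ib[0]))

# Hypothetical helper: evaluate get_ln_rate on every full window of one group
def rolling_ln_rate(group, window=3):
    ib = group['IB'].to_numpy()
    ob = group['OB'].to_numpy()
    delta = group['Delta'].to_numpy()
    out = np.full(len(group), np.nan)  # rows without a full window stay NaN
    for end in range(window, len(group) + 1):
        start = end - window
        out[end - 1] = get_ln_rate(ib[start:end], ob[start:end], delta[start:end])
    # Keep the group's original index so the result aligns with df
    return pd.Series(out, index=group.index)

# Compute per product, then stitch the pieces back together by index
parts = [rolling_ln_rate(grp) for _, grp in df.groupby('Product')]
df['Ln_Rate'] = pd.concat(parts)
print(df)
```

The explicit loop is slower than a vectorised rolling operation on large frames, but it keeps the three-argument signature from the question and avoids relying on how `rolling` passes windows to the callback.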
answered Nov 24 '25 16:11 by Markus Rother
