I have a dataframe like this:
Date Y X1 X2 X3
22 2004-05-12 9.348158e-09 0.000081 0.000028 0.000036
23 2004-05-13 9.285989e-09 0.000073 0.000081 0.000097
24 2004-05-14 9.732308e-09 0.000085 0.000073 0.000096
25 2004-05-17 2.235977e-08 0.000089 0.000085 0.000099
26 2004-05-18 2.792661e-09 0.000034 0.000089 0.000150
27 2004-05-19 9.745323e-09 0.000048 0.000034 0.000053
......
1000 2004-05-20 1.835462e-09 0.000034 0.000048 0.000099
1001 2004-05-21 3.529089e-09 0.000037 0.000034 0.000043
1002 2004-05-24 3.453047e-09 0.000043 0.000037 0.000059
1003 2004-05-25 2.963131e-09 0.000038 0.000043 0.000059
1004 2004-05-26 1.390032e-09 0.000029 0.000038 0.000054
I want to run a rolling OLS regression with a 100-day window, as follows:
First, for the 101st row, I regress Y on X1, X2, X3 using the 1st to 100th rows, and estimate Y for the 101st row;
Then, for the 102nd row, I regress Y on X1, X2, X3 using the 2nd to 101st rows, and estimate Y for the 102nd row;
Then, for the 103rd row, I regress Y on X1, X2, X3 using the 3rd to 102nd rows, and estimate Y for the 103rd row;
......
Until the last row.
How can I do this?
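# Works only in old pandas: pd.stats.ols was deprecated in 0.18 and removed in 0.20.
import pandas as pd
model = pd.stats.ols.MovingOLS(y=df.Y, x=df[['X1', 'X2', 'X3']],
                               window_type='rolling', window=100, intercept=True)
df['Y_hat'] = model.y_predict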
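The procedure described in the question can also be written as an explicit loop, which does not depend on any particular pandas or statsmodels version. This is only a minimal sketch: the helper name rolling_ols_predict and the synthetic demo dataframe are assumptions made for illustration, not part of any library.

```python
import numpy as np
import pandas as pd


def rolling_ols_predict(df, y_col, x_cols, window=100):
    """Predict each row's y from an OLS fit on the `window` rows before it."""
    y = df[y_col].to_numpy(dtype=float)
    # Prepend a column of ones so the fit includes an intercept.
    X = np.column_stack([np.ones(len(df)), df[x_cols].to_numpy(dtype=float)])
    y_hat = np.full(len(df), np.nan)
    for t in range(window, len(df)):
        # Fit on rows t-window .. t-1; row t itself is excluded.
        beta, *_ = np.linalg.lstsq(X[t - window:t], y[t - window:t], rcond=None)
        y_hat[t] = X[t] @ beta
    return pd.Series(y_hat, index=df.index, name="Y_hat")


# Synthetic data (hypothetical) standing in for the question's dataframe.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(300, 3)), columns=["X1", "X2", "X3"])
demo["Y"] = (1.0 + 2.0 * demo["X1"] - demo["X2"] + 0.5 * demo["X3"]
             + rng.normal(scale=0.1, size=300))
demo["Y_hat"] = rolling_ols_predict(demo, "Y", ["X1", "X2", "X3"], window=100)
```

The first 100 rows of Y_hat are NaN because no full window precedes them. The loop refits from scratch at every row, so it is simple but slow on long series.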
statsmodels 0.11.0 (January 2020) added RollingOLS:
from statsmodels.regression.rolling import RollingOLS
# add a constant column so the regression includes an intercept
df['const'] = 1
# fit with a 100-row window, as in the question
model = RollingOLS(endog=df['Y'], exog=df[['const', 'X1', 'X2', 'X3']], window=100)
rres = model.fit()
rres.params.tail()  # look at the last few intercepts and coefficients
Or use an R-style regression formula (the intercept is added automatically):
model = RollingOLS.from_formula('Y ~ X1 + X2 + X3', data=df, window=100)
rres = model.fit()
rres.params.tail()