When I run some old code, I get the following warning: "pandas.stats.ols module is deprecated and will be removed in a future version. We refer to external packages like statsmodels". I could not work out whether statsmodels has a user-friendly rolling OLS module. What was nice about the pandas.stats.ols module was that you could easily specify whether an intercept was needed, the type of window (rolling, expanding), and the window length. Is there a module that does exactly the same?
For example:
import numpy as np
from pandas import DataFrame
YY = DataFrame(np.log(np.linspace(1, 10, 10)), columns=['Y'])
XX = DataFrame(np.transpose([np.linspace(1, 10, 10), np.linspace(2, 10, 10)]), columns=['XX1', 'XX2'])
from pandas.stats.ols import MovingOLS  # the deprecated module that triggers the warning
MovingOLS(y=YY['Y'], x=XX, intercept=True, window_type='rolling', window=5).resid
I would like an example of how to get the result of the last line (the residuals of the moving OLS) using statsmodels or any other module.
Thanks
I created an ols module designed to mimic pandas' deprecated MovingOLS; it is here.
It has three core classes:
OLS: static (single-window) ordinary least-squares regression. The outputs are NumPy arrays.
RollingOLS: rolling (multi-window) ordinary least-squares regression. The outputs are higher-dimension NumPy arrays.
PandasRollingOLS: wraps the results of RollingOLS in pandas Series & DataFrames. Designed to mimic the look of the deprecated pandas module.
Note that the module is part of a package (which I'm currently in the process of uploading to PyPI) and it requires one inter-package import.
The first two classes above are implemented entirely in NumPy and primarily use matrix algebra. RollingOLS also makes extensive use of broadcasting. Attributes largely mimic statsmodels' OLS RegressionResultsWrapper.
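To make the "matrix algebra plus broadcasting" idea concrete, here is a minimal sketch of a rolling regression in plain NumPy. It is not the package's actual code: the function name `rolling_ols`, its parameters, and the synthetic data are hypothetical and for illustration only, and it assumes NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_ols(y, x, window, intercept=True):
    """Hypothetical helper: fit one OLS regression per rolling window.

    Returns (beta, resid) with shapes (n_windows, k) and (n_windows, window).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    if intercept:
        x = np.column_stack([np.ones(len(x)), x])

    # Stack every window: xw is (n_windows, window, k), yw is (n_windows, window).
    xw = sliding_window_view(x, window, axis=0).transpose(0, 2, 1)
    yw = sliding_window_view(y, window)

    # Solve the normal equations (X'X) beta = X'y for all windows at once.
    xt = xw.transpose(0, 2, 1)
    xtx = xt @ xw                  # (n_windows, k, k)
    xty = xt @ yw[..., None]       # (n_windows, k, 1)
    beta = np.linalg.solve(xtx, xty)[..., 0]

    # Residuals of each window under that window's own coefficients.
    fitted = np.einsum('nwk,nk->nw', xw, beta)
    return beta, yw - fitted


# Synthetic data, made up for this sketch.
rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=(n, 2))
y = 1.0 + x @ np.array([2.0, -3.0]) + 0.1 * rng.normal(size=n)

beta, resid = rolling_ols(y, x, window=12)
print(beta.shape, resid.shape)
```

Solving the normal equations directly is fine for a well-conditioned toy problem like this one; with nearly collinear regressors, a least-squares solver such as `np.linalg.lstsq` per window is the more robust choice.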
An example:
# Pull some data from fred.stlouisfed.org
from pandas_datareader.data import DataReader
syms = {'TWEXBMTH': 'usd',
        'T10Y2YM': 'term_spread',
        'PCOPPUSDM': 'copper'}
data = (DataReader(syms.keys(), 'fred', start='2000-01-01')
        .pct_change()
        .dropna())
data = data.rename(columns=syms)
print(data.head())
#                 usd  term_spread    copper
# DATE
# 2000-02-01  0.01260     -1.40909  -0.01997
# 2000-03-01 -0.00012      2.00000  -0.03720
# 2000-04-01  0.00564      0.51852  -0.03328
# 2000-05-01  0.02204     -0.09756   0.06135
# 2000-06-01 -0.01012      0.02703  -0.01850
# Rolling regressions
from pyfinance.ols import OLS, RollingOLS, PandasRollingOLS
y = data.usd
x = data.drop('usd', axis=1)
window = 12 # months
model = PandasRollingOLS(y=y, x=x, window=window)
# Here `.resids` will be a stacked, MultiIndex'd DataFrame. Each outer
# index is a "period ending" and each inner index block holds the
# subperiods for that rolling window.
print(model.resids)
# end         subperiod
# 2001-01-01  2000-02-01    0.00834
#             2000-03-01   -0.00375
#             2000-04-01    0.00194
#             2000-05-01    0.01312
#             2000-06-01   -0.01460
#             2000-07-01   -0.00462
#             2000-08-01   -0.00032
#             2000-09-01    0.00299
#             2000-10-01    0.01103
#             2000-11-01    0.00556
#             2000-12-01   -0.01544
#             2001-01-01   -0.00425
# 2017-06-01  2016-07-01    0.01098
#             2016-08-01   -0.00725
#             2016-09-01    0.00447
#             2016-10-01    0.00422
#             2016-11-01   -0.00213
#             2016-12-01    0.00558
#             2017-01-01    0.00166
#             2017-02-01   -0.01554
#             2017-03-01   -0.00021
#             2017-04-01    0.00057
#             2017-05-01    0.00085
#             2017-06-01   -0.00320
# Name: resids, dtype: float64
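If a single residual per date is more convenient than stacked blocks, one option is to keep only the residual of the last observation in each window. The sketch below does that with a plain NumPy loop and also covers the rolling/expanding and intercept switches mentioned in the question. The helper name `window_end_resid` and its parameters are hypothetical, and whether its output matches the old `MovingOLS(...).resid` exactly is an assumption that has not been verified here.

```python
import numpy as np


def window_end_resid(y, x, window, intercept=True, window_type="rolling"):
    """Hypothetical helper: residual of the last observation in each window.

    Positions before the first full window are left as NaN.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    if intercept:
        x = np.column_stack([np.ones(len(y)), x])

    out = np.full(len(y), np.nan)
    for end in range(window, len(y) + 1):
        # An expanding window always starts at the first observation.
        start = 0 if window_type == "expanding" else end - window
        xs, ys = x[start:end], y[start:end]
        beta, *_ = np.linalg.lstsq(xs, ys, rcond=None)
        out[end - 1] = ys[-1] - xs[-1] @ beta
    return out


# Data modelled on the question, but with XX1 as the only regressor:
# XX2 there is an exact linear function of XX1, so adding it alongside an
# intercept would make the design matrix rank-deficient.
y = np.log(np.linspace(1, 10, 10))
x = np.linspace(1, 10, 10)

resid = window_end_resid(y, x, window=5)
resid_expanding = window_end_resid(y, x, window=5, window_type="expanding")
print(resid)
```

The explicit loop trades speed for readability; for long series the per-window work can be vectorised instead.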