
How to Use Lagged Time-Series Variables in a Python Pandas Regression Model?

I'm creating time-series econometric regression models. The data is stored in a Pandas data frame.

How can I do lagged time-series econometric analysis using Python? I have used EViews in the past (a standalone econometrics program, i.e. not a Python package). To estimate an OLS equation in EViews you can write something like:

equation eq1.ls log(usales) c log(usales(-1)) log(price(-1)) tv_spend radio_spend

Note the lagged dependent and lagged price terms. It's these lagged variables that seem difficult to handle in Python, e.g. with scikit or statsmodels (unless I've missed something).

Once I've created a model I'd like to perform tests and use the model to forecast.

I'm not interested in doing ARIMA, Exponential Smoothing, or Holt Winters time-series projections - I'm mainly interested in time-series OLS.

Steve Maughan asked Oct 03 '16 21:10




1 Answer

pandas lets you shift the values in a data frame without moving the index. For example,

df.shift(1)

moves every value down one row, so each row holds the previous row's value (a lag of one period), while

df.shift(-1)

moves every value up one row, so each row holds the next row's value (a lead of one period).

So if you have a daily time series, you can use shift(1) to create a one-day lag of your price values:

df['lagprice'] = df['price'].shift(1)
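
To make the direction of the shift concrete, here is a minimal sketch on a tiny data frame. The dates, the prices, and the extra leadprice column are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical daily prices; the dates and values are made up for illustration.
df = pd.DataFrame(
    {"price": [10.0, 11.0, 12.0, 13.0]},
    index=pd.date_range("2016-10-01", periods=4, freq="D"),
)

# shift(1): each row receives the previous row's value (a one-period lag).
df["lagprice"] = df["price"].shift(1)

# shift(-1): each row receives the next row's value (a one-period lead).
df["leadprice"] = df["price"].shift(-1)

print(df)

# The first lag and the last lead have no source row, so they are NaN.
# Drop the incomplete rows before estimating a model.
clean = df.dropna()
print(clean)
```

Shifting always leaves missing values at one end of the series, so the estimation sample is shorter than the raw data by the number of lags used.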

After that, if you want to do OLS, you can look at the scipy module here:

http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.linregress.html
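
Note that scipy.stats.linregress takes a single explanatory variable, so it cannot fit the question's equation as written. As one possible workaround (a swapped-in technique, not something the answer above prescribes), the lagged columns can be passed to numpy.linalg.lstsq, which solves the ordinary least squares problem for several regressors. The sketch below assumes column names taken from the question's EViews equation, uses randomly generated data, and invents the planned spend values for the forecast. It returns point estimates only. For standard errors and diagnostic tests a dedicated statistics package such as statsmodels would be the usual place to look.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only; column names follow the question's equation.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "usales": rng.uniform(50, 150, n),
    "price": rng.uniform(1, 5, n),
    "tv_spend": rng.uniform(0, 10, n),
    "radio_spend": rng.uniform(0, 10, n),
})

# Build the variables: take logs first, then lag by one period with shift(1).
data = pd.DataFrame({
    "log_usales": np.log(df["usales"]),
    "log_usales_lag1": np.log(df["usales"]).shift(1),
    "log_price_lag1": np.log(df["price"]).shift(1),
    "tv_spend": df["tv_spend"],
    "radio_spend": df["radio_spend"],
}).dropna()  # the first row has no lagged values, so it is dropped

y = data["log_usales"].to_numpy()
X = data.drop(columns="log_usales")
X.insert(0, "const", 1.0)  # intercept, the `c` term in the EViews equation

# Ordinary least squares: choose beta to minimise ||y - X beta||^2.
beta, *_ = np.linalg.lstsq(X.to_numpy(), y, rcond=None)
coefs = pd.Series(beta, index=X.columns)
print(coefs)

# One-step-ahead point forecast: the lagged terms come from the last observed row.
# The planned spend values (5.0 and 5.0) are hypothetical inputs.
last = df.iloc[-1]
x_next = np.array([1.0, np.log(last["usales"]), np.log(last["price"]), 5.0, 5.0])
log_forecast = float(x_next @ beta)
print(log_forecast)
```

The forecast is on the log scale because the dependent variable is log(usales). Forecasting further ahead would mean feeding each forecast back in as the next period's lagged value.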

Steven G answered Oct 17 '22 07:10