
Lagging Variables in R

Tags: r, time-series

What is the most efficient way to make a matrix of lagged variables in R for an arbitrary variable (i.e. not a regular time series)?

For example:

Input:

x <- c(1,2,3,4) 

2 lags, output:

[1,NA, NA]
[2, 1, NA]
[3, 2,  1]
[4, 3,  2]
James in Ottawa asked Aug 21 '09 13:08




2 Answers

You can do this with the built-in embed() function. Its second argument, dimension, is the number of columns in the result, so it is one more than what you've called 'lag':

x <- c(NA,NA,1,2,3,4)
embed(x,3)

## returns
     [,1] [,2] [,3]
[1,]    1   NA   NA
[2,]    2    1   NA
[3,]    3    2    1
[4,]    4    3    2

embed() was discussed in a previous answer by Joshua Reich. (Note that I prepended x with NAs to replicate your desired output).
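The NA padding can be folded into a small helper. This is only a sketch: lag_matrix and n_lags are hypothetical names, not part of base R, and the function assumes x is a plain atomic vector.

```r
# Hypothetical helper (not part of base R): build a matrix whose first
# column is x and whose remaining columns are x lagged by 1..n_lags.
lag_matrix <- function(x, n_lags) {
  stopifnot(is.atomic(x), length(n_lags) == 1, n_lags >= 0)
  # Pad the front with NAs so the result keeps one row per element of x.
  padded <- c(rep(NA, n_lags), x)
  # embed() needs dimension = n_lags + 1 columns: lag 0 plus each lag.
  out <- embed(padded, n_lags + 1)
  colnames(out) <- paste0("lag", 0:n_lags)
  out
}

m <- lag_matrix(c(1, 2, 3, 4), 2)
print(m)
```

Calling lag_matrix(x, 2) on the question's input should reproduce the four-row matrix shown above, with the columns labelled lag0, lag1 and lag2.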

embed() is not particularly well-named, but it is quite useful and powerful for operations involving sliding windows, such as rolling sums and moving averages.
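As a minimal sketch of the sliding-window use, each row of the embed() result is one window, so row-wise summaries give rolling statistics. The variable names and the window width of 3 are assumptions chosen for illustration.

```r
x <- c(1, 2, 3, 4, 5)
width <- 3  # assumed window width, for illustration only

# Each row holds one window of `width` consecutive values
# (most recent value first).
windows <- embed(x, width)

rolling_sum  <- rowSums(windows)   # 6 9 12
rolling_mean <- rowMeans(windows)  # 2 3 4

print(rolling_sum)
print(rolling_mean)
```

Without NA padding the result has length(x) - width + 1 rows, one per complete window.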

medriscoll answered Sep 21 '22 22:09



Use a proper class for your objects; base R has ts, which has a lag() function to operate on. Note that these ts objects came from a time when 'delta' or 'frequency' were constant: monthly or quarterly data, as in macroeconomic series.
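A minimal sketch of the ts route, assuming the question's input is treated as a regular series with frequency 1. A negative k in lag() moves each value to a later time, which is what produces a lagged column; stats::lag is spelled out because other packages (dplyr, for one) mask lag().

```r
x <- ts(c(1, 2, 3, 4))

# cbind() on ts objects aligns them on the union of their time indices
# and fills the gaps with NA.
lagged <- cbind(x,
                lag1 = stats::lag(x, -1),
                lag2 = stats::lag(x, -2))

# Drop the rows that extend past the end of the original series.
lagged <- window(lagged, end = end(x))
print(lagged)
```

The windowed result has one row per original observation, with NA wherever a lag reaches back before the start of the series.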

For irregular data such as (business-)daily, use the zoo or xts packages, which can also deal (very well!) with lags. To go further from there, you can use packages like dynlm or dlm, which allow for dynamic regression models with lags.
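A hedged sketch of the zoo approach on an irregular index. The dates are made up for illustration, and the block only runs if the zoo package is installed. With zoo objects, lag() shifts by observations rather than by calendar time, so the gap in the dates does not matter.

```r
if (requireNamespace("zoo", quietly = TRUE)) {
  # Made-up, irregularly spaced dates (note the missing 2009-08-19).
  dates <- as.Date(c("2009-08-17", "2009-08-18", "2009-08-20", "2009-08-21"))
  z <- zoo::zoo(c(1, 2, 3, 4), order.by = dates)

  # Negative k lags the series; merge() aligns the columns on the
  # index and fills unmatched positions with NA.
  lagged_z <- merge(z,
                    lag1 = stats::lag(z, -1),
                    lag2 = stats::lag(z, -2))
  print(lagged_z)
}
```

The same idea should carry over to xts, although its lag() uses the opposite sign convention for k, so check the package documentation before porting the call.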

The CRAN Task Views on Time Series, Econometrics, and Finance all have further pointers.

Dirk Eddelbuettel answered Sep 19 '22 22:09
