I have a problem dealing with time series in R.
#--------------read data
library(XLConnect)  # provides loadWorkbook() and readWorksheet()
wb = loadWorkbook("Countries_Europe_Prices.xlsx")
df = readWorksheet(wb, sheet="Sheet2")
x <- df$Year
y <- df$Index1
y <- lag(y, 1, na.pad = TRUE)
cbind(x, y)
It gives me the following output:
x y
[1,] 1974 NA
[2,] 1975 50.8
[3,] 1976 51.9
[4,] 1977 54.8
[5,] 1978 58.8
[6,] 1979 64.0
[7,] 1980 68.8
[8,] 1981 73.6
[9,] 1982 74.3
[10,] 1983 74.5
[11,] 1984 72.9
[12,] 1985 72.1
[13,] 1986 72.3
[14,] 1987 71.7
[15,] 1988 72.9
[16,] 1989 75.3
[17,] 1990 81.2
[18,] 1991 84.3
[19,] 1992 87.2
[20,] 1993 90.1
But I want the first value in y to be 50.8 and so forth. In other words, I want to get a negative lag. I don't get it, how can I do it?
My problem is very similar to the question "Basic lag in R vector/dataframe", but I cannot solve it. I guess I still do not understand the solution(s)...
The opposite of the lag() function is lead().
lead: a window function that returns the value that is offset rows after the current row, or defaultValue if there are fewer than offset rows after the current row. For example, an offset of one returns the next row at any given point in the window partition.
lag: shifts the times one step back. It does not change the values, only the times. For example, lag changes the tsp attribute of a series from c(1, 4, 1) to c(0, 3, 1): the start time shifts from 1 to 0, the end time shifts from 4 to 3, and the frequency remains 1 because a shift does not change the frequency.
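To make the tsp description above concrete, here is a minimal sketch using base R's stats::lag on a ts object. The four values are made up for illustration; only the time attributes matter here.

```r
# A four-point series with times 1..4 and frequency 1 (illustrative values).
z <- ts(c(10, 20, 30, 40), start = 1, frequency = 1)
tsp(z)                      # start = 1, end = 4, frequency = 1

# stats::lag with k = 1 moves the time base one step back.
z_lag <- stats::lag(z, k = 1)
tsp(z_lag)                  # start = 0, end = 3, frequency = 1

# The values themselves are untouched, so no NA is introduced.
as.numeric(z_lag)
```

This is why stats::lag on its own does not pad a vector with NA: it relabels the times instead of moving the values.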
How about the built-in 'lead' function? (from the dplyr package) Doesn't it do exactly the job of Ahmed's function?
cbind(x, lead(y, 1))
If you want to be able to calculate either positive or negative lags in the same function, I suggest a 'shorter' version of his 'shift' function:
shift = function(x, lag) {
  # Index 1 (negative lag) picks lead; index 2 (positive lag) picks lag.
  switch(sign(lag) / 2 + 1.5,
         dplyr::lead(x, abs(lag)),
         dplyr::lag(x, abs(lag)))
}
It sets up two cases, one using lag and the other using lead, and picks one depending on the sign of your lag (dividing the sign by 2 and adding 1.5 is a trick that maps {-1, +1} to {1, 2}).
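The index trick can be checked on its own in base R. In this small sketch, pick is a hypothetical helper (not part of the answer above) that returns the name of the branch switch would choose:

```r
# Map the sign of the lag onto a 1-based index for switch().
sign(c(-3, 2)) / 2 + 1.5    # 1 for the negative lag, 2 for the positive one

# Hypothetical helper: report which branch would be taken.
pick <- function(lag) switch(sign(lag) / 2 + 1.5, "lead", "lag")
pick(-3)                    # "lead"
pick(2)                     # "lag"

# A lag of 0 gives 1.5, which switch() truncates to 1, so the lead
# branch is taken with an offset of 0.
pick(0)                     # "lead"
```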
There is an easier way of doing this, which I have taken in full from this link. Here I will explain, step by step, what you should do:
First create the following function by running the following code:
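shift <- function(x, shift_by) {
  stopifnot(is.numeric(shift_by))
  stopifnot(is.numeric(x))

  # Several shifts at once: return one column per requested shift.
  if (length(shift_by) > 1)
    return(sapply(shift_by, shift, x = x))

  out <- NULL
  abs_shift_by <- abs(shift_by)
  # Positive shift = lead (pad the end with NA);
  # negative shift = lag (pad the start with NA).
  if (shift_by > 0)
    out <- c(tail(x, -abs_shift_by), rep(NA, abs_shift_by))
  else if (shift_by < 0)
    out <- c(rep(NA, abs_shift_by), head(x, -abs_shift_by))
  else
    out <- x
  out
}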
This creates a function called shift with two arguments: the vector whose lag/lead you want to compute, and the number of lags/leads you need.
Example:
Suppose you have the following vector:
x <- 1:10
x
[1] 1 2 3 4 5 6 7 8 9 10
If you need x's first-order lag:
shift(x,-1)
[1] NA 1 2 3 4 5 6 7 8 9
If you need x's first-order lead (negative lag):
shift(x,1)
[1] 2 3 4 5 6 7 8 9 10 NA
Simpler solution:
y = dplyr::lead(y,1)
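As a quick sketch of what that one-liner does, assuming the dplyr package is installed. The year and index values below are made up for illustration and are not the asker's spreadsheet:

```r
# Illustrative data only.
year  <- 2001:2005
index <- c(10.5, 11.0, 12.5, 13.0, 14.5)

# Each row now holds the following year's value; the last row has
# nothing after it, so it becomes NA.
led <- dplyr::lead(index, 1)
cbind(year, led)

# The default argument replaces the trailing NA with a chosen value.
dplyr::lead(index, 1, default = 0)
```

Calling the function as dplyr::lead avoids attaching the whole package, which would otherwise mask stats::lag with dplyr's own lag.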