I want to use a long vector of numbers, to create a matrix where each column is a successive offset (lag or lead) of the original vector. If n
is the maximum offset, the matrix will have dimensions [length(vector), n * 2 + 1]
(because we want offsets in both directions, and include the 0 offset, i.e. the original vector).
To illustrate, consider the following vector:
test <- c(2, 8, 1, 10, 7, 5, 9, 3, 4, 6)
[1] 2 8 1 10 7 5 9 3 4 6
Now we create offsets of values, let's say for n == 3
:
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] NA NA NA 2 8 1 10
[2,] NA NA 2 8 1 10 7
[3,] NA 2 8 1 10 7 5
[4,] 2 8 1 10 7 5 9
[5,] 8 1 10 7 5 9 3
[6,] 1 10 7 5 9 3 4
[7,] 10 7 5 9 3 4 6
[8,] 7 5 9 3 4 6 NA
[9,] 5 9 3 4 6 NA NA
[10,] 9 3 4 6 NA NA NA
I am looking for an efficient solution. data.table
or tidyverse
solutions more than welcome.
Returning only the rows that have no NA
's (i.e. rows 4 to 7) is also ok.
lags <- lapply(3:1, function(x) dplyr::lag(test, x))
leads <- lapply(1:3, function(x) dplyr::lead(test, x))
l <- c(lags, test, leads)
matrix(unlist(l), nrow = length(test))
In base R, you can use embed
to get rows 4 through 7. You have to reverse the column order, however.
embed(test, 7)[, 7:1]
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 2 8 1 10 7 5 9
[2,] 8 1 10 7 5 9 3
[3,] 1 10 7 5 9 3 4
[4,] 10 7 5 9 3 4 6
data
test <- c(2, 8, 1, 10, 7, 5, 9, 3, 4, 6)
This will produce what you need...
n <- 3
t(embed(c(rep(NA,n), test, rep(NA,n)), length(test)))[length(test):1,]
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] NA NA NA 2 8 1 10
[2,] NA NA 2 8 1 10 7
[3,] NA 2 8 1 10 7 5
[4,] 2 8 1 10 7 5 9
[5,] 8 1 10 7 5 9 3
[6,] 1 10 7 5 9 3 4
[7,] 10 7 5 9 3 4 6
[8,] 7 5 9 3 4 6 NA
[9,] 5 9 3 4 6 NA NA
[10,] 9 3 4 6 NA NA NA
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With