Getting at the previous n-rows in a data frame?

Tags: plyr

I have the following data frame.

date id value
2012-01-01 1 0.3
2012-01-01 2 0.5
2012-01-01 3 0.2
2012-01-01 4 0.8
2012-01-01 5 0.2
2012-01-01 6 0.8
2012-01-01 7 0.1
2012-01-01 8 0.4
2012-01-01 9 0.3
2012-01-01 10 0.2

There are several dates, and for each date I have 10 id values, as shown above, each with a value. For every id, I would like to find the previous n entries in the "value" column. For example, if n = 3, I want the output to be as follows.

date id value value1 value2 value3
2012-01-01 1 0.3 NA NA NA
2012-01-01 2 0.5 NA NA NA
2012-01-01 3 0.2 NA NA NA
2012-01-01 4 0.8 0.2 0.5 0.3
2012-01-01 5 0.2 0.8 0.2 0.5
...

Is there an easy way to get this, either through plyr or using mapply? Thanks in advance.


asked May 29 '12 06:05

broccoli

2 Answers

You can do this quite easily using base functions:

id <- 1:10
value <- c(0.3,0.5,0.2,0.8,0.2,0.8,0.1,0.4,0.3,0.2)
test <- data.frame(id,value)

test$valprev1 <- c(rep(NA,1),head(test$value,-1))
test$valprev2 <- c(rep(NA,2),head(test$value,-2))
test$valprev3 <- c(rep(NA,3),head(test$value,-3))

Result

   id value valprev1 valprev2 valprev3
1   1   0.3       NA       NA       NA
2   2   0.5      0.3       NA       NA
3   3   0.2      0.5      0.3       NA
4   4   0.8      0.2      0.5      0.3
5   5   0.2      0.8      0.2      0.5
6   6   0.8      0.2      0.8      0.2
7   7   0.1      0.8      0.2      0.8
8   8   0.4      0.1      0.8      0.2
9   9   0.3      0.4      0.1      0.8
10 10   0.2      0.3      0.4      0.1

Here is an sapply version wrapped in a function, which builds all n lagged columns at once:

prevrows <- function(data, n) {
  sapply(1:n, function(x) c(rep(NA, x), head(data, -x)))  # column x = values lagged by x rows
}
prevrows(test$value, 3)

Which gives just this:

      [,1] [,2] [,3]
 [1,]   NA   NA   NA
 [2,]  0.3   NA   NA
 [3,]  0.5  0.3   NA
 [4,]  0.2  0.5  0.3
 [5,]  0.8  0.2  0.5
 [6,]  0.2  0.8  0.2
 [7,]  0.8  0.2  0.8
 [8,]  0.1  0.8  0.2
 [9,]  0.4  0.1  0.8
[10,]  0.3  0.4  0.1

You could then apply this to each date in your data like this (this assumes test also has a date column, which the small example above leaves out):

result <- tapply(test$value,test$date,prevrows,3)

This returns a list with one matrix per date. Provided the rows of test are sorted by date, you can row-bind these matrices and add them back to your data set with:

data.frame(test,do.call(rbind,result))
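
For completeness, here is a minimal sketch of the per-date step on data that actually has a date column. The two-date data frame is made up for illustration, and the sketch swaps tapply for ave(), which hands results back in the original row order, so no sorting or rbind step is needed. The column names valprev1 to valprev3 follow the first example; treat this as one possible approach rather than the only one.

```r
# Made-up data: two dates with five ids each (not the asker's real data)
test <- data.frame(
  date  = rep(c("2012-01-01", "2012-01-02"), each = 5),
  id    = rep(1:5, times = 2),
  value = c(0.3, 0.5, 0.2, 0.8, 0.2, 0.8, 0.1, 0.4, 0.3, 0.2)
)

n <- 3  # number of previous values to keep

# For each lag k, shift the values down by k rows within each date.
# ave() applies FUN per group and returns a vector aligned with the input rows.
# Note: this assumes every date has at least n rows.
for (k in 1:n) {
  test[[paste0("valprev", k)]] <- ave(
    test$value, test$date,
    FUN = function(v) c(rep(NA, k), head(v, -k))
  )
}

print(test)
```

The first row of each date gets NA in every lag column, so values never leak from one date into the next.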

answered Oct 12 '22 22:10

thelatemail

Using data.table v1.9.5+ this is as simple as:

library(data.table)
setDT(dt)  # dt is your data frame; setDT converts it to a data.table by reference

lags <- dt[, shift(value, n = c(1,2,3))]

or to append them as additional columns in the same data.table:

dt[, c("lag1", "lag2", "lag3") := shift(value, n = c(1,2,3))]
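
Neither call above groups by date, so with several dates the lags would run across date boundaries. A sketch of the grouped version follows, assuming the grouping column is named date as in the question. The two-date data is made up for illustration.

```r
library(data.table)

# Made-up data: two dates with five ids each (not the asker's real data)
dt <- data.table(
  date  = rep(c("2012-01-01", "2012-01-02"), each = 5),
  id    = rep(1:5, times = 2),
  value = c(0.3, 0.5, 0.2, 0.8, 0.2, 0.8, 0.1, 0.4, 0.3, 0.2)
)

# by = date restarts the lags at the first row of every date;
# shift() defaults to type = "lag" and pads with NA
dt[, c("lag1", "lag2", "lag3") := shift(value, n = 1:3), by = date]

print(dt)
```

The rows within each date are lagged in their existing order, so sort by id first if the order matters.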

answered Oct 13 '22 00:10

Bar
