Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Use a value from the previous row in an R data.table calculation

Tags:

r

data.table

I want to create a new column in a data.table calculated from the current value of one column and the previous of another. Is it possible to access previous rows?

E.g.:

> DT <- data.table(A=1:5, B=1:5*10, C=1:5*100) > DT    A  B   C 1: 1 10 100 2: 2 20 200 3: 3 30 300 4: 4 40 400 5: 5 50 500 > DT[, D := C + BPreviousRow] # What is the correct code here? 

The correct answer should be

> DT    A  B   C   D 1: 1 10 100  NA 2: 2 20 200 210 3: 3 30 300 320 4: 4 40 400 430 5: 5 50 500 540 
like image 637
Corvus Avatar asked Feb 04 '13 14:02

Corvus


1 Answers

With shift() implemented in v1.9.6, this is quite straightforward.

DT[ , D := C + shift(B, 1L, type="lag")] # or equivalently, in this case, DT[ , D := C + shift(B)] 

From NEWS:

  1. New function shift() implements fast lead/lag of vector, list, data.frames or data.tables. It takes a type argument which can be either "lag" (default) or "lead". It enables very convenient usage along with := or set(). For example: DT[, (cols) := shift(.SD, 1L), by=id]. Please have a look at ?shift for more info.

See history for previous answers.

like image 114
Arun Avatar answered Oct 12 '22 19:10

Arun