
data.table lag operator throwing error

Tags: r, data.table

Hi, I am trying to create a data.table with lagged variables by group id. Some ids have only one row in the data.table. In that case, shift() with type 'lag' gives an error, but type 'lead' works fine. Here is an example:

library(data.table)
dt = data.table(id = 1, week = as.Date('2014-11-11'), sales = 1)
lead = 2
lag = 2
lagSalesNames = paste('lag_sales_', 1:lag, sep = '')
dt[,(lagSalesNames) := shift(sales, 1:lag, NA, 'lag'), by = list(id)]

This gives me the following error:

All items in j=list(...) should be atomic vectors or lists. If you are trying something like j=list(.SD,newcol=mean(colA)) then use := by group instead (much quicker), or cbind or merge afterwards.

But if I try the same thing with lead instead, it works fine:

dt[,(lagSalesNames) := shift(sales, 1:lag, NA, 'lead'), by = list(id)]

It also seems to work fine if the data.table has more than one row. For example, the following with 2 rows works:

dt = data.table(id = 1, week = as.Date(c('2014-11-11', '2014-11-11')), sales = 1:2)
dt[,(lagSalesNames) := shift(sales, 1:lag, NA, 'lag'), by = list(id)]

I am using data.table version 1.9.5 on a Linux machine with R version 3.1.0. Any help would be much appreciated.

Thanks, Ashin

Ashin Mukherjee asked May 06 '26 10:05


1 Answer

Thanks for the report. This is now fixed (issue #1014) with commit #1722 in data.table v1.9.5.

Now works as intended:

dt
#    id       week sales lag_sales_1 lag_sales_2
# 1:  1 2014-11-11     1          NA          NA
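
For anyone who wants to check this on their own installation, here is a minimal sketch of the single-row case from the question, assuming a data.table build that includes the fix. The names `n_lag` and `lag_names` are just illustrative stand-ins for `lag` and `lagSalesNames` above (renamed to avoid confusion with the base R function `lag`), and the arguments to `shift()` are passed by name for readability.

```r
library(data.table)

# Single-row group: the case that failed before the fix
dt <- data.table(id = 1, week = as.Date("2014-11-11"), sales = 1)

n_lag <- 2                                           # illustrative name
lag_names <- paste0("lag_sales_", seq_len(n_lag))    # "lag_sales_1", "lag_sales_2"

# shift() with a vector n returns a list with one element per lag,
# which := assigns to the columns named in lag_names
dt[, (lag_names) := shift(sales, n = seq_len(n_lag), fill = NA, type = "lag"),
   by = id]

print(dt)
```

With only one row per id there is nothing to lag from, so both new columns should contain only NA, matching the output shown above.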
Arun answered May 07 '26 23:05


