 

R data.table setting values

Tags:

r

data.table

I'm trying to set the px and vol columns of data.table b from the matching rows of data.table a, using the code below. (For now I'm falling back to a slow for loop.)

a=data.table(
  date_id = rep(seq(as.Date('2013-01-01'),as.Date('2013-04-10'),'days'),5),
  px =rnorm(500,mean=50,sd=5),
  vol=rnorm(500,mean=500000,sd=150000),
  id=rep(letters[1:5],each=100)
  )

b=data.table(
  date_id=rep(seq(as.Date('2013-01-01'),length.out=600,by='days'),5),
  id=rep(letters[1:5],each=600),
  px=NA_real_,
  vol=NA_real_
  )

setkeyv(a,c('date_id','id'))
setkeyv(b,c('date_id','id'))

The following approach doesn't work:

s = a[1,id]
d = a[1,date_id]
b[id == s & date_id == d, list(names(b)[3:4])] <- a[id == s & date_id ==d, list(names(a)[2:3])]

It fails with the following error:

Error in `[<-.data.table`(`*tmp*`, id == s & date_id == d, list(names(b)[3:4]),  : 
  j must be atomic vector, see ?is.atomic

What am I doing wrong, and how do I set those values from one data.table to the other elementwise? The actual table has quite a few columns, so writing them out by hand is not an option for me.

Thanks

Tahnoon Pasha asked Oct 18 '25 07:10



1 Answer

There are multiple issues in your example.

First, if you want to select columns of a data.table by a character vector of names computed in j, as in dt[ , names(dt)[3:4] ], you have to add with=FALSE:

b[ , names(b)[3:4], with = FALSE ]
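As a minimal sketch of that selection on a toy table (the table dt and the variable cols are made up for illustration and are not part of the question's data), with = FALSE and the .SDcols form should both return the named columns as a data.table:

```r
library(data.table)

# Hypothetical toy table, only for illustration
dt <- data.table(id = c("a", "b"), px = c(1.5, 2.5), vol = c(10, 20))

# Column names held in a variable rather than typed out by hand
cols <- names(dt)[2:3]

# with = FALSE tells data.table to treat j as column names, not as an expression
sub <- dt[, cols, with = FALSE]

# Equivalent selection through .SD and .SDcols
sub_sd <- dt[, .SD, .SDcols = cols]
```

Either form avoids writing the column names out by hand, which matters when the real table has many columns.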

Second, I am not sure whether assigning values in a data.table is possible at all using the assignment operator (<-). For this purpose there is the ultra-fast update-by-reference operator, :=, which modifies the table in place:

b[
  id == s & date_id == d,
  # LHS wrapped in parentheses; with = FALSE together with := is deprecated since data.table 1.9.4
  (names(b)[3:4]) := a[id == s & date_id == d, names(a)[2:3], with = FALSE]
]
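To see the update-by-reference behaviour in isolation, here is a small sketch on made-up tables (src, dst and cols are hypothetical names, not the question's a and b). It assumes a data.table version where the left-hand side of := may be a parenthesized character vector:

```r
library(data.table)

# Hypothetical source and destination tables
src <- data.table(id = c("a", "b"), px = c(1.5, 2.5), vol = c(10, 20))
dst <- data.table(id = c("a", "b"), px = NA_real_, vol = NA_real_)

cols <- c("px", "vol")

# Only the row with id == "a" is updated; dst is modified in place,
# so no `dst <- ...` reassignment is needed
dst[id == "a", (cols) := src[id == "a", cols, with = FALSE]]
```

After this call, the row for id "a" in dst carries the values from src, while the row for id "b" is still NA.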

Third, subsetting a data.table with the dt[ col == value, ] syntax is possible but slow. Especially if you have already set keys on the columns you want to subset by, you should use the following syntax:

b[
  J(d,s),
  # LHS wrapped in parentheses; with = FALSE together with := is deprecated since data.table 1.9.4
  (names(b)[3:4]) := a[J(d,s), names(a)[2:3], with = FALSE]
]

Fourth, this all looks to me as if you want a simple join of two tables. So the most straightforward approach would be:

a[ b[ , list(date_id, id) ] ]

Or, considering your comment that you only want to overwrite the columns px and vol in b for the rows that also appear in a:

b[a, c("px", "vol") := a[, list(px, vol)]]
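Since the question mentions that the real table has many columns, the same update join can probably be written without naming any value column. The sketch below uses small made-up tables and assumes a data.table version that supports the on= argument (1.9.6 or later) and the i. prefix for columns of the joined table; the variable cols and the key column names are taken from the example, not from the real data:

```r
library(data.table)

# Small stand-ins for the question's tables a and b
a <- data.table(
  date_id = as.Date("2013-01-01") + 0:2,
  id      = "a",
  px      = c(50, 51, 52),
  vol     = c(100, 200, 300)
)
b <- data.table(
  date_id = as.Date("2013-01-01") + 0:4,
  id      = "a",
  px      = NA_real_,
  vol     = NA_real_
)

# Every column of a that is not a join column gets copied over
cols <- setdiff(names(a), c("date_id", "id"))

# Update join: for rows of b that match a row of a, overwrite cols
# with the values from a (referred to with the i. prefix)
b[a, on = c("date_id", "id"), (cols) := mget(paste0("i.", cols))]
```

Rows of b without a match in a keep their NA values, and b is updated in place, so no loop over rows is needed.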
Beasterfield answered Oct 19 '25 22:10
