
Retrieve the column name inside the lapply with .SD

Tags:

r

data.table

I would like to apply a function to all columns of a data.table, so I use .SD with lapply. But inside lapply I cannot access the other columns of my table.

For instance

x = data.table(a=1:10, b=10:1, id=1:5)
x[,lapply(.SD,function(t){t*id}),.SDcols=c(1,2)]
Error in ..FUN(a) : object 'id' not found

As a workaround, I do the following:

x[,lapply(.SD,function(t){t*x$id}),.SDcols=c(1,2)]

Can we do better?

Nicolas asked Jul 02 '13 07:07




1 Answer

Just remove .SDcols=c(1,2): that argument is what drops the third column (id) from .SD, so id cannot be found inside lapply.

 > x[,lapply(.SD,function(t){t*id})]
     a  b id
 1:  1 10  1
 2:  4 18  4
 3:  9 24  9
 4: 16 28 16
 5: 25 30 25
 6:  6  5  1
 7: 14  8  4
 8: 24  9  9
 9: 36  8 16
10: 50  5 25
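
To see why dropping .SDcols changes what ends up in the result, it can help to inspect which columns .SD holds in each case. This is a minimal sketch, assuming the data.table package is installed. The variable names all_cols and sd_cols are only illustrative, and id is written out with rep() instead of relying on recycling.

```r
library(data.table)

# Same data as in the question; id is spelled out instead of recycled
x <- data.table(a = 1:10, b = 10:1, id = rep(1:5, 2))

# Without .SDcols, .SD holds every column, including id
all_cols <- x[, names(.SD)]

# With .SDcols, .SD is restricted to the selected columns
sd_cols <- x[, names(.SD), .SDcols = c(1, 2)]

print(all_cols)
print(sd_cols)
```

Since id is part of .SD in the first case, it is also multiplied by itself, which is why it shows up as a third column in the output above.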

To leave id out of the result, any of the following will work:

x[,lapply(.SD[,list(a,b)], `*`, id)]

x[,lapply(.SD[,-3,with=FALSE], `*`, id)]

x[,lapply(.SD, `*`,id)][, list(a,b)]
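
As a quick check that these routes agree, here is a small sketch comparing the first and last variants. It assumes the data.table package is installed; r1 and r2 are illustrative names, and id is again written out with rep().

```r
library(data.table)

x <- data.table(a = 1:10, b = 10:1, id = rep(1:5, 2))

# Select a and b from .SD first, then multiply each by id
r1 <- x[, lapply(.SD[, list(a, b)], `*`, id)]

# Multiply every column by id, then keep only a and b
r2 <- x[, lapply(.SD, `*`, id)][, list(a, b)]

# Both routes should yield the same two-column table
print(all.equal(r1, r2))
```

The first form avoids computing id * id only to throw it away, which may matter on wide tables.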

Michele answered Sep 28 '22 13:09