Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I replace aggregate by data.table?

I've got the following data:

> dput(bla)
structure(list(V1 = structure(c(4L, 4L, 4L, 2L), .Label = c("DDDD", 
"EEEE", "NNNN", "PPPP", "ZZZZ"), class = "factor"), V2 = c(100014096L, 
100014098L, 100014099L, 100014995L), V3 = c(0.742, 0.779, 0.744, 
0.42), V4 = c(1.077, 1.054, 1.049, 0.984), V5 = c(0.662, 0.663, 
0.671, 0.487), V6 = c(1.107, 1.14, 1.11, 0.849), V7 = c(0.456, 
0.459, 0.459, 1.278)), .Names = c("V1", "V2", "V3", "V4", "V5", 
"V6", "V7"), class = "data.frame", row.names = c(NA, 4L))

> bla
    V1        V2    V3    V4    V5    V6    V7
1 PPPP 100014096 0.742 1.077 0.662 1.107 0.456
2 PPPP 100014098 0.779 1.054 0.663 1.140 0.459
3 PPPP 100014099 0.744 1.049 0.671 1.110 0.459
4 EEEE 100014995 0.420 0.984 0.487 0.849 1.278

I want to perform a function to achieve the following, currently I'm using aggregate:

> linem<- aggregate(bla[,3:7], list(line=bla$V1),mean, na.rm=T) 
> linem
  line    V3    V4        V5    V6    V7
1 EEEE 0.420 0.984 0.4870000 0.849 1.278
2 PPPP 0.755 1.060 0.6653333 1.119 0.458

To improve the performance of this script I've been trying to get a grasp of data.table to do this. How can I use data.table to get the above output?
I have been trying to do it with data.table, however if there is faster methods like in this question it would be good too.

like image 834
Bas Avatar asked Dec 20 '25 16:12

Bas


2 Answers

We convert the 'data.frame' to 'data.table' (setDT(bla)), group by 'V1', specify the columns in the .SDcols, loop over the columns (lapply(.SD,) and get the mean.

 library(data.table)
 setDT(bla)[, lapply(.SD, mean), by = V1, .SDcols= 3:ncol(bla)]
 #     V1    V3    V4        V5    V6    V7
 #1: PPPP 0.755 1.060 0.6653333 1.119 0.458
 #2: EEEE 0.420 0.984 0.4870000 0.849 1.278
like image 182
akrun Avatar answered Dec 22 '25 07:12

akrun


A "vectorized" option without lapply but colMeans on .SD:

library(data.table)

setDT(bla)[,as.list(colMeans(.SD[ , 2:6])), V1]
#     V1    V3    V4        V5    V6    V7
#1: PPPP 0.755 1.060 0.6653333 1.119 0.458
#2: EEEE 0.420 0.984 0.4870000 0.849 1.278
like image 36
Colonel Beauvel Avatar answered Dec 22 '25 09:12

Colonel Beauvel



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!