Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R data.table "by" for "i"

Tags:

r

data.table

In a given data.table what would be the syntax to select column V1 where V2 = max(V2), grouped by V3.

For example: In the mtcars dataset, I would like to find out what is the hp that corresponds to the observation equal to max(disp), grouped by cyl

Here is my ungraceful solution, using which:

mtcars <- data.table(mtcars)
mtcars[which(mtcars$disp %in% mtcars[, max(disp), by = .(cyl)]$V1), .(cyl,hp)]
   cyl  hp
1:   6 110
2:   4  62
3:   8 205

Is there a more "data.table" way of achieving the same result?

like image 792
Ivan P. Avatar asked Dec 05 '22 18:12

Ivan P.


2 Answers

We can try the join

mtcars[mtcars[, list(disp=max(disp)), by = cyl], 
    on = c('cyl', 'disp')][, c('cyl', 'hp'), with=FALSE]
#   cyl  hp
#1:   6 110
#2:   4  62
#3:   8 205

Or here is a shorter version to get the expected output.

mtcars[, .SD[disp==max(disp), .(hp)], by = cyl]
#   cyl  hp
#1:   6 110
#2:   4  62
#3:   8 205
like image 75
akrun Avatar answered Dec 28 '22 12:12

akrun


You could use .I:

mtcars[mtcars[, .I[which.max(disp)], by = cyl]$V1, .(cyl, hp)]
#   cyl  hp
#1:   6 110
#2:   4  62
#3:   8 205

Or

mtcars[, hp[disp == max(disp)], by=cyl]
#   cyl  V1
#1:   6 110
#2:   4  62
#3:   8 205

And your own approach could be slightly shortened to:

mtcars[disp %in% mtcars[, max(disp), by = .(cyl)]$V1, .(cyl,hp)]
like image 31
talat Avatar answered Dec 28 '22 14:12

talat