In a given data.table what would be the syntax to select column V1 where V2 = max(V2), grouped by V3.
For example: In the mtcars dataset, I would like to find out what is the hp that corresponds to the observation equal to max(disp), grouped by cyl
Here is my ungraceful solution, using which:
mtcars <- data.table(mtcars)
mtcars[which(mtcars$disp %in% mtcars[, max(disp), by = .(cyl)]$V1), .(cyl,hp)]
cyl hp
1: 6 110
2: 4 62
3: 8 205
Is there a more "data.table" way of achieving the same result?
We can try the join
mtcars[mtcars[, list(disp=max(disp)), by = cyl],
on = c('cyl', 'disp')][, c('cyl', 'hp'), with=FALSE]
# cyl hp
#1: 6 110
#2: 4 62
#3: 8 205
Or here is a shorter version to get the expected output.
mtcars[, .SD[disp==max(disp), .(hp)], by = cyl]
# cyl hp
#1: 6 110
#2: 4 62
#3: 8 205
You could use .I
:
mtcars[mtcars[, .I[which.max(disp)], by = cyl]$V1, .(cyl, hp)]
# cyl hp
#1: 6 110
#2: 4 62
#3: 8 205
Or
mtcars[, hp[disp == max(disp)], by=cyl]
# cyl V1
#1: 6 110
#2: 4 62
#3: 8 205
And your own approach could be slightly shortened to:
mtcars[disp %in% mtcars[, max(disp), by = .(cyl)]$V1, .(cyl,hp)]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With