
data.table index of a second column

Tags: r, data.table

I have two categorical columns (A, B) and a numerical column (C). For each group defined by B, I want the value of A in the row where C is at its maximum. I'm looking for a data.table solution.

library(data.table)

dt <- data.table( A = c("a","b","c"), 
                  B = c("d","d","d"), 
                  C = c(1,2,3))
dt
   A B C
1: a d 1
2: b d 2
3: c d 3

# I want to find the value of A for the maximum value
# of C when grouped by B
dt[,max(C), by=c("B")]
   B V1
1: d  3

# How can I get the matching value of the A column, here "c"?
zach asked Jan 08 '23 09:01



1 Answer

Another option is to sort by C in descending order and then keep only the first row of each B group with unique(). This should be faster for a big data set with many groups, because it sorts once instead of calculating a maximum per group.

unique(dt[order(-C)], by = "B")
#    A B C
# 1: c d 3
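
For comparison, here is a minimal sketch that puts the sort-once approach next to two per-group idioms (`.SD[which.max(C)]` and `.I[which.max(C)]`). The extra rows with B = "e" are made up for illustration and are not part of the question's data; they only make the grouping visible.

```r
library(data.table)

# Illustrative data: the question's three rows plus a second,
# hypothetical group "e" so there is more than one group
dt <- data.table(A = c("a", "b", "c", "x", "y"),
                 B = c("d", "d", "d", "e", "e"),
                 C = c(1, 2, 3, 5, 4))

# Option 1: sort once by descending C, keep the first row of each B group
by_sort <- unique(dt[order(-C)], by = "B")

# Option 2: within each B group, keep the row where C is largest
by_group <- dt[, .SD[which.max(C)], by = B]

# Option 3: collect the row numbers of the per-group maxima, then subset
by_index <- dt[dt[, .I[which.max(C)], by = B]$V1]

by_sort$A   # values of A, ordered by descending C
by_group$A  # values of A, ordered by first appearance of each B group
```

All three return one row per group; when C has ties within a group, each keeps only a single row. The row order differs between the sorted result and the grouped ones, so compare them by B rather than by position.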
David Arenburg answered Jan 18 '23 00:01
