I'm new to R and having a heck of a time grappling with the syntax. Let's say I've got the following data frame, data:
value  label  second
1      a      q
2      a      q
3      a      ASDF
4      b      q
6      b      QWERTY
6      b      QWERTY
7      c      q
8      c      q
9      c      q
10     d      q
Now, I want to get a vector of the df$second values that correspond to the maximum of df$value for a given value of df$label. For instance, given df$label = 'a', I want to return 'ASDF'. For df$label = 'b', I want to return 'QWERTY', 'QWERTY'.
Here's what I'm trying:
max_value <- max(data$value[data$label == 'a'])
result <- c()
for (x in data$value){
if (x == max_value){
result <- c(result, data$second)
}
}
Now this does not generate the proper results vector. I'd like to figure out a way to do this with sapply, tapply, mapply etc. I'm just having trouble getting my head around these functions. Any help would be greatly appreciated.
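This is straightforward in data.table:
library(data.table)
# Convert to a data.table keyed on label so rows can be looked up by label
DT <- data.table(df, key="label")
# `lab` is whatever label value you are trying to find, e.g. lab <- "a"
# Subset to that label via the key, then keep the rows at the maximum value
DT[.(lab)][value==max(value), second]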
Note that if you want to do this for all values of label, just use the by argument:
DT[, c(.SD, mx=max(value)), by=label][value==mx, second, by=label]
   label second
1:     a   ASDF
2:     b QWERTY
3:     b QWERTY
4:     c      q
5:     d      q
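Since the question asks about base R, here is a minimal sketch that avoids data.table and the explicit loop. It assumes the data frame is called df, as in the data.table code above, and rebuilds it from the table in the question. The names in_a, res_a, is_max and res_all are made up for illustration.

```r
# Rebuild the example data frame from the question (assumed to be named `df`).
df <- data.frame(
  value  = c(1, 2, 3, 4, 6, 6, 7, 8, 9, 10),
  label  = c("a", "a", "a", "b", "b", "b", "c", "c", "c", "d"),
  second = c("q", "q", "ASDF", "q", "QWERTY", "QWERTY", "q", "q", "q", "q"),
  stringsAsFactors = FALSE
)

# Single label: combine two logical conditions instead of looping.
in_a  <- df$label == "a"
res_a <- df$second[in_a & df$value == max(df$value[in_a])]

# All labels: ave() returns each row's group maximum, aligned with the rows,
# so comparing it with df$value flags every row that ties for its group's max.
is_max  <- df$value == ave(df$value, df$label, FUN = max)

# split() then collects the matching `second` values into one vector per label.
res_all <- split(df$second[is_max], df$label[is_max])
```

Here res_a holds the values for label 'a' only, and res_all is a list with one character vector per label, so res_all$b or res_all[["b"]] picks out a single group. The result is a list rather than a plain vector because groups with ties, such as 'b', return more than one value.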