I have a data frame with some numbers (score) and a repeating ID. I want to get the maximum value for each ID. I used this function:
top = aggregate(df$score, list(df$ID),max)
This returned a data frame, top, with the maximum value for each ID.
However, one of the IDs has two equal maximum values, and this function returns only one of them.
Is there any way to retain both maximum values?
For example:
df
ID score
1 12
1 15
1 1
1 15
2 23
2 12
2 13
The above function gives me this: top
ID Score
1 15
2 23
I need this: top
ID Score
1 15
1 15
2 23
I recommend data.table as Chris mentioned (good for speed, but steeper learning curve).
Or if you don't want data.table you could use plyr:
library(plyr)
ddply(df, .(ID), subset, score==max(score))
# same as ddply(df, .(ID), function (x) subset(x, score==max(score)))
You can convert to a data.table:
library(data.table)
DT <- as.data.table(df)
# for each ID, keep every row whose score equals that group's maximum
DT[, .SD[score == max(score)], by=ID]
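If you would rather avoid extra packages, the same filter can be sketched in base R. This is only a minimal sketch, not something suggested in the answers above: it rebuilds the example df from the question and uses ave() to compute the per-ID maximum for every row, then keeps the rows that match it, so ties are retained.

```r
# Example data copied from the question
df <- data.frame(ID    = c(1, 1, 1, 1, 2, 2, 2),
                 score = c(12, 15, 1, 15, 23, 12, 13))

# ave() returns a vector as long as df$score, holding each row's group maximum
group_max <- ave(df$score, df$ID, FUN = max)

# Keep every row whose score equals its group's maximum (ties included)
top <- df[df$score == group_max, ]
rownames(top) <- NULL  # reset row names after subsetting
top
```

Unlike aggregate(), which collapses each group to a single value, this subsets the original rows, so any other columns in df are carried along as well.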