df <- data.frame(
  id = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
  value = c(4,3,1,3,4,6,6,1,8,4))
I want to get the max value within each id group. I tried the following, but got an error saying the replacement has 4 rows and the data has 10. I understand why (aggregate() returns one row per id, not one per original row), but I don't know how to correct it:
df$max.by.id <- aggregate(value ~ id, df, max)
This is how I ended up doing it successfully:
max.by.id <- aggregate(value ~ id, df, max)
names(max.by.id) <- c("id", "max")
df2 <- merge(df,max.by.id, by.x = "id", by.y = "id")
df2
#    id value max
# 1  A1     4   8
# 2  A1     4   8
# 3  A1     8   8
# 4  A2     3   3
# 5  A2     3   3
# 6  A2     1   3
# 7  A3     6   6
# 8  A3     4   6
# 9  A4     1   6
# 10 A4     6   6
Is there a better way? Thanks in advance.
ave() is the function for that task:
df$max.by.id <- ave(df$value, df$id, FUN=max)
Example:
df <- data.frame(
  id = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
  value = c(4,3,1,3,4,6,6,1,8,4))
df$max.by.id <- ave(df$value, df$id, FUN=max)
The result of ave() has the same length as the original vector of values (which is also the length of the grouping variables). Each value in the result is placed at the position of the row it belongs to, so every row receives the maximum of its own group. For more information, read the documentation of ave().
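To make the length difference concrete, here is a small sketch using the data from the question. The names per.group and per.row are only illustrative; they do not appear in the original posts.

```r
# Same data as in the question
df <- data.frame(
  id = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
  value = c(4,3,1,3,4,6,6,1,8,4))

# aggregate() returns one row per group, which is why assigning it
# to a column of the 10-row df fails
per.group <- aggregate(value ~ id, df, max)
nrow(per.group)
# [1] 4

# ave() returns one value per original row, in the original row order
per.row <- ave(df$value, df$id, FUN = max)
length(per.row)
# [1] 10
per.row
# [1] 8 3 6 3 8 6 6 3 8 6
```

Because per.row lines up with the rows of df, it can be assigned directly as a new column without a merge, and the original row order is kept.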
With data.table, you can compute the max by id "inside" the data, automatically adding the newly computed value (one per id, repeated on every row of that id) as a new column:
library(data.table)
setDT(df)[, max.by.id := max(value), by=id]
df
#     id value max.by.id
#  1: A1     4         8
#  2: A2     3         3
#  3: A4     1         6
#  4: A2     3         3
#  5: A1     4         8
#  6: A4     6         6
#  7: A3     6         6
#  8: A2     1         3
#  9: A1     8         8
# 10: A3     4         6
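Note that setDT() converts df to a data.table in place. If the original data.frame should stay untouched, one possible variant (a sketch, assuming the data.table package is installed; the name dt is only illustrative) is to work on a copy made with as.data.table():

```r
library(data.table)

# Same data as in the question
df <- data.frame(
  id = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
  value = c(4,3,1,3,4,6,6,1,8,4))

# as.data.table() returns a copy, so df itself remains a plain data.frame
dt <- as.data.table(df)

# := adds the column by reference to dt; by = id computes max(value) per group
dt[, max.by.id := max(value), by = id]

# The per-row result agrees with the base R ave() approach
all(dt$max.by.id == ave(df$value, df$id, FUN = max))
# [1] TRUE
```

Like ave(), this keeps the original row order, unlike the merge() approach, which sorts the result by id.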