Calculate summary by group and bring the value back into the data frame [duplicate]

Question

df <- data.frame(
id = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
value = c(4,3,1,3,4,6,6,1,8,4))

I want to get the max value within each id group. I tried the following but got an error saying the replacement has 4 rows and the data has 10, which I understand but don't know how to correct:

df$max.by.id <- aggregate(value ~ id, df, max)

This is how I ended up doing it successfully:

max.by.id <- aggregate(value ~ id, df, max)  
names(max.by.id) <- c("id", "max")
df2 <- merge(df, max.by.id, by = "id")
df2
#   id value max
#1  A1     4   8
#2  A1     4   8
#3  A1     8   8
#4  A2     3   3
#5  A2     3   3
#6  A2     1   3
#7  A3     6   6
#8  A3     4   6
#9  A4     1   6
#10 A4     6   6

Is there a better way? Thanks in advance.

jogo · Accepted Answer

ave() is the function for that task:

df$max.by.id <- ave(df$value, df$id, FUN=max)

Example:

df <- data.frame(
  id = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
  value = c(4,3,1,3,4,6,6,1,8,4))

df$max.by.id <- ave(df$value, df$id, FUN=max)

The result of ave() has the same length as the original vector of values (which is also the length of the grouping variables). Each value of the result is placed at the position that matches its group, so the original row order is preserved. For more information, read the documentation of ave().
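
To make the length argument concrete, here is a small sketch (not part of the original answer) that compares the shape of the aggregate() result with the ave() result on the question's data. The match() lookup at the end is an extra illustration of how the aggregated table could be mapped back by hand; the names agg and via.match are made up for this example.

```r
# Rebuild the example data from the question.
df <- data.frame(
  id = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
  value = c(4,3,1,3,4,6,6,1,8,4))

# aggregate() returns one row per group (4 here), which is why assigning it
# directly to a column of a 10-row data frame fails.
agg <- aggregate(value ~ id, df, max)
nrow(agg)

# ave() returns one value per original row, in the original row order.
df$max.by.id <- ave(df$value, df$id, FUN = max)
length(df$max.by.id) == nrow(df)

# Hand-rolled equivalent: look up each row's id in the aggregated table.
via.match <- agg$value[match(df$id, agg$id)]
all(via.match == df$max.by.id)
```

Unlike merge(), neither ave() nor the match() lookup reorders the rows of df.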

Cath · Answer

With data.table, you can compute the max by id "inside" the data, which automatically adds the newly computed value as a column (the value is the same for all rows of a given id):

library(data.table)
setDT(df)[, max.by.id := max(value), by=id]
df
#    id value max.by.id
# 1: A1     4         8
# 2: A2     3         3
# 3: A4     1         6
# 4: A2     3         3
# 5: A1     4         8
# 6: A4     6         6
# 7: A3     6         6
# 8: A2     1         3
# 9: A1     8         8
#10: A3     4         6
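
As a possible extension of the same idea, the sketch below splits the one-liner into its two steps and then adds two more per-group columns in a single call. It assumes the data.table package is installed; the column names min.by.id and n.by.id are illustrative and do not come from the original answer.

```r
library(data.table)

df <- data.frame(
  id = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
  value = c(4,3,1,3,4,6,6,1,8,4))

# setDT() converts the data frame to a data.table in place (no copy).
setDT(df)

# := adds the column by reference; the row order is left unchanged.
df[, max.by.id := max(value), by = id]

# Several per-group summaries at once: group minimum and group size (.N).
df[, c("min.by.id", "n.by.id") := list(min(value), .N), by = id]
df
```

Because := modifies df by reference, no reassignment such as df <- ... is needed.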

Tags:

r

seakyourpeak