R Subset data.frame from max value of one vector and grouped by another [duplicate]

Question

ID <- c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C')
WK <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5)
NumSuccess <- c(0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 3)
Data <- data.frame(ID, WK, NumSuccess)

I am trying to create a subset data.frame "Data2" that, for each "ID", keeps the row with the maximum value of "WK", along with the corresponding value of "NumSuccess". The resulting data.frame should look like this:

ID <- c('A', 'B', 'C')
WK <- c(3, 3, 5)
NumSuccess <- c(2, 1, 3)
Data2 <- data.frame(ID, WK, NumSuccess)

akrun · Accepted Answer

This can be done in more than one way. If the maximum 'WK' may be tied within an 'ID' and you want all the rows with that maximum, filter on the logical condition (WK == max(WK)) after grouping by 'ID'.

library(dplyr)
Data %>%
  group_by(ID) %>%
  filter(WK == max(WK))
#   ID WK NumSuccess
#1  A  3          2
#2  B  3          1
#3  C  5          3

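If there is a single maximum 'WK' per 'ID', we can use which.max, or use arrange to sort the dataset by 'WK' in descending order and take the first row for each 'ID'. If ties do exist, which.max returns only the first matching position.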

Data %>%
  group_by(ID) %>%
  slice(which.max(WK))
#   ID WK NumSuccess
#1  A  3          2
#2  B  3          1
#3  C  5          3

Data %>%
  group_by(ID) %>%
  arrange(desc(WK)) %>%
  slice(1)
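
As a hedged alternative, more recent dplyr releases (1.0.0 and later) provide slice_max, whose with_ties argument covers both cases described above. The sketch below uses a small hypothetical data frame, Tied, which is not part of the original question, so that the difference is visible:

```r
library(dplyr)

# Hypothetical data with a tie: ID "A" reaches its maximum WK (3) twice.
Tied <- data.frame(
  ID         = c("A", "A", "A", "B", "B"),
  WK         = c(1, 3, 3, 1, 2),
  NumSuccess = c(0, 2, 5, 0, 1)
)

# Keep every row that shares the group maximum (same idea as WK == max(WK)).
all_max <- Tied %>%
  group_by(ID) %>%
  slice_max(WK, n = 1, with_ties = TRUE) %>%
  ungroup()

# Keep exactly one row per group (same idea as slice(which.max(WK))).
one_max <- Tied %>%
  group_by(ID) %>%
  slice_max(WK, n = 1, with_ties = FALSE) %>%
  ungroup()

nrow(all_max)  # 3 rows: two for "A", one for "B"
nrow(one_max)  # 2 rows: one per ID
```

Which form to prefer depends on whether tied rows are meaningful in your data or should be collapsed to a single row per 'ID'.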

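Similar approaches in data.table are:

library(data.table)
# setDT() converts Data to a data.table by reference
setDT(Data)[, .SD[max(WK) == WK], ID]    # all rows tied for the max WK
setDT(Data)[, .SD[which.max(WK)], ID]    # first max-WK row per ID
# sort by WK, then take the last row (.N) of each ID
setkey(setDT(Data), WK)[, .SD[.N], ID]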
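
Another data.table idiom, shown here only as a sketch, collects row indices with .I and subsets the table once. It is often reported to scale better than .SD on large tables, though that is worth measuring on your own data. The names DT, idx and DT2 are illustrative and not from the original answer:

```r
library(data.table)

# Same data as in the question, built directly as a data.table.
DT <- data.table(
  ID         = c("A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C"),
  WK         = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5),
  NumSuccess = c(0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 3)
)

# .I holds row numbers in the full table; collect one row index per ID ...
idx <- DT[, .I[which.max(WK)], by = ID]$V1

# ... then subset the table once with those indices.
DT2 <- DT[idx]
DT2
```

Like which.max elsewhere, this keeps only the first row if the maximum 'WK' is tied within an 'ID'.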

Or we can use ave from base R:

Data[with(Data, ave(WK, ID, FUN = max) == WK), ]
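
For completeness, a base R sketch that returns exactly one row per 'ID' is to sort and then drop repeated IDs. This is a different technique from the ave approach above (order plus duplicated), and the intermediate name ord is illustrative only:

```r
# Data from the question, rebuilt here as a plain data.frame.
Data <- data.frame(
  ID         = c("A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C"),
  WK         = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5),
  NumSuccess = c(0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 3)
)

# Sort by ID, then by WK descending, so each ID's max-WK row comes first.
ord <- Data[order(Data$ID, -Data$WK), ]

# duplicated() flags every repeat of an ID; negating keeps the first row of each.
Data2 <- ord[!duplicated(ord$ID), ]
rownames(Data2) <- NULL
Data2
```

Unlike the ave version, this keeps a single row per 'ID' even when the maximum 'WK' is tied.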

Tags: r, subset

user3594490
