Select highest values in a dataframe by group

Question

I have the following data frame:

dat <- data.frame(Cases = c("Student3","Student3","Student3","Student1","Student1",
"Student2","Student2","Student2","Student4"), Class = rep("Math", 9),
Scores = c(9,5,2,7,3,8,5,1,7), stringsAsFactors = FALSE)


> dat
   Cases    Class   Scores
1 Student3  Math      9
2 Student3  Math      5
3 Student3  Math      2
4 Student1  Math      7
5 Student1  Math      3
6 Student2  Math      8
7 Student2  Math      5
8 Student2  Math      1
9 Student4  Math      7

I also have a second data frame that lists each student once:

d <- data.frame(Cases = c("Student3", "Student1",
"Student2", "Student4"), Class = rep("Math", 4), stringsAsFactors = FALSE)

    Cases  Class
1 Student3  Math
2 Student1  Math
3 Student2  Math
4 Student4  Math

Using these two, I want to extract the highest score for each student, so that my output looks like this:

> dat_output
    Cases  Class   Scores
1 Student3  Math      9
2 Student1  Math      7
3 Student2  Math      8
4 Student4  Math      7

I tried merge, but it returns every score for each student rather than only the highest one.

Ronak Shah · Accepted Answer

We can use sapply over each value of Cases in d, subset dat to the rows for that student, and take the maximum of their Scores.

sapply(d$Cases, function(x) max(dat$Scores[dat$Cases %in% x]))

#Student3 Student1 Student2 Student4 
#       9        7        8        7

To get the result as a data.frame:

transform(d, Scores = sapply(d$Cases, function(x) 
                     max(dat$Scores[dat$Cases %in% x])))

#    Cases Class Scores
# Student3  Math      9 
# Student1  Math      7
# Student2  Math      8
# Student4  Math      7
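As a hedged variant of the call above, here is a minimal sketch that swaps sapply for vapply and uses plain column assignment instead of transform. The name res is an assumption made for illustration; dat and d are copied from the question. vapply checks that each result is a single numeric value, and USE.NAMES = FALSE keeps the student names from being attached to the result.

```r
# Sample data copied from the question
dat <- data.frame(
  Cases = c("Student3", "Student3", "Student3", "Student1", "Student1",
            "Student2", "Student2", "Student2", "Student4"),
  Class = rep("Math", 9),
  Scores = c(9, 5, 2, 7, 3, 8, 5, 1, 7),
  stringsAsFactors = FALSE
)
d <- data.frame(
  Cases = c("Student3", "Student1", "Student2", "Student4"),
  Class = rep("Math", 4),
  stringsAsFactors = FALSE
)

# res is a hypothetical name for the output; it starts as a copy of d
res <- d

# For each student in d, take the maximum of that student's scores in dat.
# numeric(1) makes vapply fail loudly if a result is not one number.
res$Scores <- vapply(
  d$Cases,
  function(x) max(dat$Scores[dat$Cases == x]),
  numeric(1),
  USE.NAMES = FALSE
)

print(res)
```

One caveat: if a student in d has no rows in dat, max() is called on an empty vector and returns -Inf with a warning, so such cases may need separate handling.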

Note: I have assumed your d to be

d <- data.frame(Cases = c("Student3", "Student1",
      "Student2", "Student4"), Class = rep("Math", 4), stringsAsFactors = FALSE)

Lennyy · Answer

If I understand correctly, you don't need d at all, since it contains no information that is not already in dat.

You can just do:

dat_output <- aggregate(Scores ~ Cases, dat, max)
dat_output

     Cases Scores
1 Student1      7
2 Student2      8
3 Student3      9
4 Student4      7
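The aggregate call above drops the Class column and sorts the students alphabetically, whereas the desired output in the question keeps Class and follows the order of d. A minimal sketch of one way to get there, assuming each student appears only once in d: group by both Cases and Class, merge with d, then restore the order of d. The name best is an assumption made for illustration.

```r
# Sample data copied from the question
dat <- data.frame(
  Cases = c("Student3", "Student3", "Student3", "Student1", "Student1",
            "Student2", "Student2", "Student2", "Student4"),
  Class = rep("Math", 9),
  Scores = c(9, 5, 2, 7, 3, 8, 5, 1, 7),
  stringsAsFactors = FALSE
)
d <- data.frame(
  Cases = c("Student3", "Student1", "Student2", "Student4"),
  Class = rep("Math", 4),
  stringsAsFactors = FALSE
)

# Highest score per student and class; best is a hypothetical helper name
best <- aggregate(Scores ~ Cases + Class, data = dat, FUN = max)

# Keep only the student/class pairs listed in d
dat_output <- merge(d, best, by = c("Cases", "Class"))

# Restore the row order of d explicitly.
# Assumption: each student appears only once in d.
dat_output <- dat_output[match(d$Cases, dat_output$Cases), ]
rownames(dat_output) <- NULL

print(dat_output)
```

Because the maximum is computed before the merge, each student contributes only one row, which is the step a plain merge(d, dat) leaves out.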

Tags:

dataframe

r

Cahidora
