Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find the Maximum Value with respect to another within two data frames (VLOOKUP which returns Max Value) in R

Tags:

dataframe

r

dplyr

I am trying to find the do a function which is similar to a vlookup in excel but which returns the maximum value and the other values in the same row. The data frame looks like this:

enter image description here enter image description here

The data frame which I am dealing with are given below:

dput(Book3)
structure(list(Item = c("ABA", "ABB", "ABC", "ABD", "ABE", "ABF"
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-6L))

dput(Book4)
structure(list(Item = c("ABA", "ABB", "ABC", "ABD", "ABE", "ABF", 
"ABA", "ABB", "ABC", "ABD", "ABE", "ABF", "ABA", "ABB", "ABC", 
"ABD", "ABE", "ABF"), Max1 = c(12, 68, 27, 17, 74, 76, 78, 93, 
94, 98, 46, 90, 5, 58, 67, 64, 34, 97), Additional1 = c(40, 66, 
100, 33, 66, 19, 8, 70, 21, 93, 48, 34, 44, 89, 74, 20, 0, 47
), Additional2 = c(39, 31, 85, 58, 0, 2, 57, 28, 31, 32, 15, 
22, 93, 41, 57, 81, 95, 46)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -18L))

The Expected output for this is given below:

enter image description here

like image 945
user20203146 Avatar asked Nov 18 '25 15:11

user20203146


2 Answers

You are looking for slice_max:

library(dplyr)
Book4 %>% 
  group_by(Item) %>% 
  slice_max(Max1)

#   Item   Max1 Additional1 Additional2
# 1 ABA      78           8          57
# 2 ABB      93          70          28
# 3 ABC      94          21          31
# 4 ABD      98          93          32
# 5 ABE      74          66           0
# 6 ABF      97          47          46
like image 163
Maël Avatar answered Nov 21 '25 07:11

Maël


An alternative base solution that is more resilient to floating-point precision problems (c.f., Why are these numbers not equal?, https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f). It also allows two behavior options if there are duplicate max-values:

  1. if you want all of them, use ties.method = "min";
  2. if you want the first (or just one) of them, then ties.method = "first".
Book4[ave(Book4$Max1, Book4$Item, FUN = function(z) rank(-z, ties.method = "first")) == 1,]
# # A tibble: 6 x 4
#   Item   Max1 Additional1 Additional2
#   <chr> <dbl>       <dbl>       <dbl>
# 1 ABE      74          66           0
# 2 ABA      78           8          57
# 3 ABB      93          70          28
# 4 ABC      94          21          31
# 5 ABD      98          93          32
# 6 ABF      97          47          46
like image 34
r2evans Avatar answered Nov 21 '25 06:11

r2evans