For each row, I need to find the minimum value across three columns that is greater than the value in another column. Say these five individuals entered a hospital in different months of the year, and they suffered several heart attacks before and after hospitalization. I need the first heart attack after hospitalization.
id<-c(100,105,108,200,205)
hosp<-c(3,5,2,6,2)
attack1<-c(1,6,3,4,1)
attack2<-c(4,7,9,10,NA)
attack3<-c(5,10,NA,NA,NA)
out<-c(7,12,11,12,9)
data <- data.frame(id,hosp,attack1,attack2,attack3,out)
   id hosp attack1 attack2 attack3 out
1 100    3       1       4       5   7
2 105    5       6       7      10  12
3 108    2       3       9      NA  11
4 200    6       4      10      NA  12
5 205    2       1      NA      NA   9
So the data should end up looking something like
   id hosp attack1 attack2 attack3 out afterh
1 100    3       1       4       5   7      4
2 105    5       6       7      10  12      6
3 108    2       3       9      NA  11      3
4 200    6       4      10      NA  12     10
5 205    2       1      NA      NA   9     NA
This is my attempt, which did not work:
min_f<-function(a){
  x<-min(a[a>hosp])
}
data %>% mutate_if(vars(attack1,attack2,attack3),min_f())
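You can use the following solution. c(...)[3:5] collects the values of the third to fifth columns (the attack columns) in each row, and x[x > ..2] keeps only those that are greater than hosp in that row. Since you were looking for the first one that is greater than the value of hosp, I used the first function on the sorted values to extract it. ..2 refers to the value of the second variable, hosp, in each row.
library(dplyr)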
library(purrr)
data %>%
  mutate(afterh = pmap_dbl(., ~ {x <- c(...)[3:5]; 
  first(sort(x[x > ..2]))}))
   id hosp attack1 attack2 attack3 out afterh
1 100    3       1       4       5   7      4
2 105    5       6       7      10  12      6
3 108    2       3       9      NA  11      3
4 200    6       4      10      NA  12     10
5 205    2       1      NA      NA   9     NA
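For comparison, the same row-wise rule can be sketched in base R without dplyr or purrr. This is only an illustration, not part of the answer above: it assumes the attack columns are named attack1 to attack3 as in the question, and the intermediate name attacks is made up for the example. It blanks out every attack that is missing or not after hosp, then takes the row-wise minimum with pmin.

```r
# Rebuild the example data from the question
data <- data.frame(
  id      = c(100, 105, 108, 200, 205),
  hosp    = c(3, 5, 2, 6, 2),
  attack1 = c(1, 6, 3, 4, 1),
  attack2 = c(4, 7, 9, 10, NA),
  attack3 = c(5, 10, NA, NA, NA),
  out     = c(7, 12, 11, 12, 9)
)

# Attack columns as a numeric matrix (one row per individual)
attacks <- as.matrix(data[, c("attack1", "attack2", "attack3")])

# Set to NA every attack that is missing or not strictly after hospitalization.
# data$hosp is recycled down each column, so row i is compared with hosp[i].
attacks[is.na(attacks) | attacks <= data$hosp] <- NA

# Row-wise minimum of what is left; a row with no qualifying attack stays NA
data$afterh <- do.call(pmin, c(as.data.frame(attacks), na.rm = TRUE))

print(data)
```

Because pmin is vectorized over whole columns, this avoids calling a function once per row, which may matter on a large data set.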
As an alternative, as mentioned by dear Mr. @Greg, on a very large data set we can use the min function in place of the first(sort()) combination for a faster evaluation time. When there is no value greater than hosp, as in the last row, min would return Inf, so I made sure that it returns 0 instead; you can change that to whatever value you prefer:
data %>%
  mutate(afterh = pmap_dbl(., ~ {x <- c(...)[3:5];
  out <- min(x[x > ..2], na.rm = TRUE);
  if(!is.finite(out)) 0 else out}))
   id hosp attack1 attack2 attack3 out afterh
1 100    3       1       4       5   7      4
2 105    5       6       7      10  12      6
3 108    2       3       9      NA  11      3
4 200    6       4      10      NA  12     10
5 205    2       1      NA      NA   9      0
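If NA is preferred over 0 for rows with no later attack, as in the desired output in the question, the Inf case can be avoided by checking for an empty result before calling min. The sketch below is an illustration only; first_after is a hypothetical helper name, not something from dplyr or purrr.

```r
# Hypothetical helper: smallest value in `x` strictly greater than `threshold`,
# or NA_real_ when no such value exists.
first_after <- function(x, threshold) {
  # Drop missing values and anything not after the threshold
  later <- x[!is.na(x) & x > threshold]
  # min() on an empty vector would return Inf with a warning, so guard it
  if (length(later) == 0) NA_real_ else min(later)
}

first_after(c(1, 4, 5), 3)     # row 1 of the example data
first_after(c(4, 10, NA), 6)   # row 4
first_after(c(1, NA, NA), 2)   # row 5: no attack after hospitalization
```

Inside the pmap_dbl call it could presumably be used as first_after(c(...)[3:5], ..2), which also sidesteps the warning that min emits when nothing qualifies.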
library(dplyr)
library(tidyr)  # provides nest() and unnest()

data %>%
  # Nest attack columns
  nest(attacks = starts_with('attack')) %>% 
  # Only one row at a time
  rowwise() %>% 
  # Find first instance for each row
  mutate(afterh = first(attacks[attacks > hosp])) %>% 
  # Unnest attacks
  unnest(attacks)