
Is there a way to collapse related variables into a single based on a condition?

Tags: r, dplyr

Let's say I have multiple variables that measure substance abuse, e.g. a1 is alcohol usage, a2 is bhang and a3 is cocaine. I would like to generate a variable afin that indicates engagement in substance abuse if any of the three is "Yes".

Is there a way to shorten the code so that I don't have to chain multiple ifelse conditions as below? I have more than 10 variables to collapse into one, so writing out each condition by hand is not ideal.

# Anymatch
library(tidyverse)

set.seed(2021)

mydata <- tibble(
  a1 = factor(round(runif(20, 1, 3)),
              labels = c("Yes", "No", "N/A")),
  a2 = factor(round(runif(20, 1, 3)),
              labels = c("Yes", "No", "N/A")),
  a3 = factor(round(runif(20, 1, 3)),
              labels = c("Yes", "No", "N/A")),
  b1 = round(rnorm(20, 10, 2)))
mydata

mydata <- mydata %>%
  mutate(afin = ifelse(a1 == "Yes" | a2 == "Yes" | a3 == "Yes", "Yes", "No"))

Moses asked Jun 09 '21 13:06


2 Answers

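This can also be done without ifelse. Build a logical vector that is TRUE when any of the selected columns equals 'Yes', add 1 to turn it into a numeric index (1 or 2), and use that index to pick from a vector of replacement values.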

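library(dplyr)
# Flag rows where at least one column whose name starts with 'a' equals 'Yes'.
# Note: cur_data() is deprecated as of dplyr 1.1.0; pick(starts_with('a')) replaces it there.
mydata %>%
     mutate(afin = c("no", "yes")[1 + (rowSums(select(cur_data(), 
        starts_with('a')) == 'Yes') > 0)])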

Output:

# A tibble: 20 x 5
   a1    a2    a3       b1 afin 
   <fct> <fct> <fct> <dbl> <chr>
 1 No    Yes   Yes       6 yes  
 2 N/A   N/A   N/A       7 no   
 3 No    No    No       12 no   
 4 No    No    N/A       7 no   
 5 No    No    Yes       9 yes  
 6 No    N/A   N/A       7 no   
 7 No    N/A   N/A       7 no   
 8 No    N/A   Yes       7 yes  
 9 N/A   N/A   Yes      10 yes  
10 N/A   N/A   N/A      11 no   
11 Yes   Yes   No       10 yes  
12 N/A   N/A   No       14 no   
13 No    N/A   Yes       9 yes  
14 No    N/A   No       14 no   
15 N/A   No    No       10 no   
16 No    Yes   Yes       8 yes  
17 No    N/A   No       13 no   
18 N/A   Yes   No        9 yes  
19 N/A   N/A   N/A      11 no   
20 No    No    N/A      11 no   
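
To see why the index trick works, here is a minimal base R sketch on a made-up three-row data frame (toy and its values are hypothetical, not the asker's data). Adding 1 to a logical vector turns FALSE into 1 and TRUE into 2, which then selects the first or second element of c("no", "yes").

```r
# Hypothetical three-row data frame, used only for illustration
toy <- data.frame(
  a1 = c("No", "N/A", "Yes"),
  a2 = c("Yes", "N/A", "No"),
  a3 = c("No", "No", "No"),
  stringsAsFactors = TRUE
)

# Comparing a data frame with a string gives a logical matrix
is_yes <- toy == "Yes"

# Count the "Yes" answers per row, then test whether there is at least one
any_yes <- rowSums(is_yes) > 0

# FALSE + 1 is 1 and TRUE + 1 is 2, so each row picks "no" or "yes"
afin <- c("no", "yes")[1 + any_yes]
print(afin)
```

Because the comparison and rowSums() are vectorised, this form should stay compact no matter how many columns are selected.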

Or use rowwise() with c_across() to evaluate the condition one row at a time:

mydata %>% 
   rowwise %>%
   mutate(afin = c("no", "yes")[1+ 
          any(c_across(starts_with('a')) == "Yes")]) %>% 
   ungroup
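
A further option, not part of the answer above, is if_any(). The sketch below assumes dplyr 1.0.4 or later (where if_any() was introduced) and uses a hypothetical three-row tibble named toy rather than the asker's data. It keeps the "Yes"/"No" labels from the question.

```r
library(dplyr)

# Hypothetical data, used only for illustration
toy <- tibble(
  a1 = factor(c("No", "N/A", "Yes")),
  a2 = factor(c("Yes", "N/A", "No")),
  a3 = factor(c("No", "No", "No")),
  b1 = c(6, 7, 12)
)

# if_any() returns TRUE for a row when the predicate holds in at least
# one of the selected columns; if_else() maps that to the labels
toy_flagged <- toy %>%
  mutate(afin = if_else(if_any(starts_with("a"), ~ .x == "Yes"), "Yes", "No"))

print(toy_flagged)
```

This avoids rowwise() altogether, which may matter on larger data since the comparison stays vectorised per column.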
akrun answered Oct 23 '22 12:10


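Another option is pmap_chr() from purrr, which walks over the factor columns row by row and passes each row's values to the function as `...`: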

library(dplyr)
library(purrr)

mydata %>% 
  mutate(afin = pmap_chr(mydata %>% select(where(is.factor)), 
                         ~ {if(any(c(...) == "Yes")) "Yes" else "No"}))


# A tibble: 20 x 5
   a1    a2    a3       b1 afin 
   <fct> <fct> <fct> <dbl> <chr>
 1 No    Yes   Yes       6 Yes  
 2 N/A   N/A   N/A       7 No   
 3 No    No    No       12 No   
 4 No    No    N/A       7 No   
 5 No    No    Yes       9 Yes  
 6 No    N/A   N/A       7 No   
 7 No    N/A   N/A       7 No   
 8 No    N/A   Yes       7 Yes  
 9 N/A   N/A   Yes      10 Yes  
10 N/A   N/A   N/A      11 No   
11 Yes   Yes   No       10 Yes  
12 N/A   N/A   No       14 No   
13 No    N/A   Yes       9 Yes  
14 No    N/A   No       14 No   
15 N/A   No    No       10 No   
16 No    Yes   Yes       8 Yes  
17 No    N/A   No       13 No   
18 N/A   Yes   No        9 Yes  
19 N/A   N/A   N/A      11 No   
20 No    No    N/A      11 No 
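
For readers without purrr, the same row-by-row idea can be sketched in base R with mapply(). This is an analogue, not the answer's method, and it assumes a hypothetical data frame toy with character (not factor) columns, plus a helper row_flag introduced here for illustration.

```r
# Hypothetical data with character columns, used only for illustration
toy <- data.frame(
  a1 = c("No", "N/A", "Yes"),
  a2 = c("Yes", "N/A", "No"),
  a3 = c("No", "No", "No"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: receives one row's values through `...`
row_flag <- function(...) if (any(c(...) == "Yes")) "Yes" else "No"

# do.call() spreads the columns into mapply(), so the call works for
# any number of columns without naming them one by one
afin <- do.call(mapply, c(list(FUN = row_flag, USE.NAMES = FALSE), as.list(toy)))
print(afin)
```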
Anoushiravan R answered Oct 23 '22 10:10