
Condition in ifelse: Value in multiple columns/variables

This might be an easy one:

I would like to create the condition "value in variableA or variableB".

What works is this:

var1 %in% c("value1", "value2") condition: var1 is value1 or value2

var2 | var3 %in% 1 condition: var2 is 1 or var3 is 1 (var2 and var3 are dummies with 0/1)

With these I can get around the repetitive code:

var1 == "value1" | var1 == "value2"

and

var2 == 1 | var3 == 1

What I am looking to replace is

var4 == "value1" | var5 == "value1"

Reproducible example:

(I leave out var1-var3)

var4 <- c("value1", "valuex")
var5 <- c("valuey", "value1")

df <- data.frame(var4, var5)

I use case_when() from the dplyr package but it should work with the base R ifelse as well.

df <- df %>% mutate(newvar= case_when( CONDITION HERE ~ "value1", 
                     TRUE~"else"))

If var4 or var5 contains value1, the new variable should be value1.

(First question on stackoverflow. Sorry for any unclarity.)

ps_r asked Jan 11 '18 11:01


1 Answer

If we need to keep only the rows where 'value1' is present in at least one of the columns, use filter_all with any_vars

df %>%
  filter_all(any_vars(. =="value1"))

For a specific subset of columns, use filter_at

df %>%
   filter_at(vars(matches("var\\d+")), any_vars(.== "value1"))

To create a binary column based on a comparison across multiple columns, use mutate_at (or mutate_all if all columns need to be compared), reduce the result to a single logical/integer vector, and bind it to the original dataset as a new column

library(dplyr)
library(purrr)
df %>% 
  # compare each matching column with "value1" (lambda instead of the deprecated funs())
  mutate_at(vars(matches("var\\d+")), ~ . == "value1") %>% 
  # combine the logical columns row-wise; assumes every column of df was matched
  reduce(`|`) %>%
  as.integer %>%
  bind_cols(df, new_var = .)
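The question notes that base R ifelse would also be acceptable. A minimal base R sketch of the same row-wise check follows; the third row of the data and the helper names cols and hit are added here for illustration only and are not part of the original example.

```r
# Example data from the question, plus a third row (added here for
# illustration) in which neither column contains "value1".
var4 <- c("value1", "valuex", "valuex")
var5 <- c("valuey", "value1", "valuez")
df <- data.frame(var4, var5)

# Columns to check (hypothetical helper variable).
cols <- c("var4", "var5")

# df[cols] == "value1" gives a logical matrix; a row sum above zero
# means at least one of the columns equals "value1" in that row.
hit <- rowSums(df[cols] == "value1", na.rm = TRUE) > 0

df$newvar <- ifelse(hit, "value1", "else")
```

Adding another column to the check then only requires extending cols, not writing another `== "value1"` comparison.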

Or as @Nick mentioned in the comments, we can use across (dplyr version >= 1.0.0) instead of the superseded mutate_at

library(dplyr)
library(purrr)
df %>%
   mutate(new_var = as.integer(reduce(across(matches("var\\d+"), ~ . == "value1"), `|`)))

akrun answered Sep 21 '22 20:09
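To fill in the CONDITION placeholder of the case_when() call from the question directly, one option is if_any(), which, as far as I know, is available from dplyr 1.0.4 onwards. This is only a sketch under that assumption; the third data row is added here for illustration so that the "else" branch is exercised.

```r
library(dplyr)

# Example data from the question, plus a third row (added here for
# illustration) in which neither column contains "value1".
df <- data.frame(
  var4 = c("value1", "valuex", "valuex"),
  var5 = c("valuey", "value1", "valuez")
)

df <- df %>%
  mutate(newvar = case_when(
    # TRUE for a row when any of the selected columns equals "value1"
    if_any(c(var4, var5), ~ .x == "value1") ~ "value1",
    TRUE ~ "else"
  ))
```

The column selection inside if_any() accepts the same tidyselect helpers as above, for example matches("var\\d+").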
