This might be an easy one:
I would like to express the condition "value is in variableA or variableB".
What works is this:
var1 %in% c("value1", "value2")
condition: var1 is value1 or value2
var2 | var3 %in% 1
condition: var2 is 1 or var3 is 1 (var2 and var3 are dummies with 0/1)
With these I can get around the repetitive code:
var1 == "value1" | var1 == "value2"
and
var2 == 1 | var3 == 1
What I am looking to replace is
var4 == "value1" | var5 == "value1"
Reproducible example:
(I leave out var1-var3)
var4 <- c("value1", "valuex")
var5 <- c("valuey", "value1")
df <- data.frame(var4, var5)
I use case_when() from the dplyr package, but it should work with base R ifelse() as well.
df <- df %>% mutate(newvar= case_when( CONDITION HERE ~ "value1",
TRUE~"else"))
If var4 or var5 contains value1, the new variable should be value1.
(First question on stackoverflow. Sorry for any unclarity.)
If we need to check whether 'value1' is present in any of the columns in each row, use filter_all with any_vars:
df %>%
filter_all(any_vars(. =="value1"))
For a specific subset of columns, use filter_at:
df %>%
filter_at(vars(matches("var\\d+")), any_vars(.== "value1"))
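In current dplyr the scoped filter_at/any_vars pair can be written with filter and if_any. The following is only a sketch, assuming dplyr 1.0.4 or later; the third row of df3 and the names df3 and kept are made up for illustration so that the filter has a row to drop.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  var4 = c("value1", "valuex"),
  var5 = c("valuey", "value1")
)

# Hypothetical extra row with no "value1" in either column (not in the question)
df3 <- rbind(df, data.frame(var4 = "valuex", var5 = "valuey"))

# Keep rows where at least one of the matched columns equals "value1"
kept <- df3 %>%
  filter(if_any(matches("var\\d+"), ~ .x == "value1"))
```

Here kept should contain only the first two rows, since the added third row has no match.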
To create a binary column from a comparison across multiple columns, use mutate_at (or mutate_all if all columns need to be compared), reduce the result to a single logical/integer vector, and bind it to the dataset as a new column:
library(dplyr)
library(purrr)
df %>%
mutate_at(vars(matches("var\\d+")), ~ . == "value1") %>%
reduce(`|`) %>%
as.integer %>%
bind_cols(df, new_var = .)
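Since the question mentions that base R ifelse would be acceptable too, the same "compare each column, then combine with |" idea can be sketched without dplyr. This is an illustrative sketch, not part of the original answer; the third data row and the helper names cols and hit are assumptions added so that the "else" branch is exercised.

```r
# Example data from the question, plus one hypothetical row with no match
df <- data.frame(
  var4 = c("value1", "valuex", "valuex"),
  var5 = c("valuey", "value1", "valuey")
)

# Columns to check: names of the form var<digits>
cols <- grep("^var[0-9]+$", names(df), value = TRUE)

# Compare each selected column with "value1", then OR the results row-wise
hit <- Reduce(`|`, lapply(df[cols], function(x) x == "value1"))

# Recode the combined logical vector into the labels the question asks for
df$newvar <- ifelse(hit, "value1", "else")
```

Note that if the columns contain NA, a row with an NA and no match yields NA rather than "else", so missing values may need separate handling.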
Or, as @Nick mentioned in the comments, we can use across (dplyr version >= 1.0.0) instead of the superseded mutate_at:
df %>%
mutate(across(matches("var\\d+"), ~ . == "value1"))
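To plug a row-wise "any column equals value1" test directly into the case_when template from the question, one option is if_any. The sketch below assumes dplyr 1.0.4 or later (where if_any is available) and uses the question's two-row example data.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  var4 = c("value1", "valuex"),
  var5 = c("valuey", "value1")
)

# if_any() returns one logical per row: TRUE if any matched column equals "value1"
df <- df %>%
  mutate(newvar = case_when(
    if_any(matches("var\\d+"), ~ .x == "value1") ~ "value1",
    TRUE ~ "else"
  ))
```

With this data both rows contain "value1" in one of the two columns, so newvar is "value1" in both. The original columns are left unchanged, which avoids the separate reduce and bind_cols steps.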