I'm trying to create a flag column based on other columns in a data frame.
Example:
df <- tribble(
~x1, ~x2, ~x3, ~x4,
1, 0, 1, 1,
0, 0, NA, NA,
1, 0, NA, 1,
0, 0, NA, NA,
0, 1, NA, 0
)
I want to create a flag column such that if the value 1 is present in any of the columns x1 through x4, the flag is 1, and 0 otherwise.
res <- df |> mutate(flag = ifelse(if_any(x1:x4, function(x) x == 1), 1, 0))
I've tried using dplyr::if_any() with ifelse(). It seems to work for the most part, but for some reason it returns NA in the false case.
> res
# A tibble: 5 × 5
x1 x2 x3 x4 flag
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 0 1 1 1
2 0 0 NA NA NA
3 1 0 NA 1 1
4 0 0 NA NA NA
5 0 1 NA 0 1
Why is this happening? What would be a better solution?
Edit: I checked what if_any() itself returns, and it seems to return NA instead of FALSE.
> res
# A tibble: 5 × 6
x1 x2 x3 x4 flag true_flase
<dbl> <dbl> <dbl> <dbl> <dbl> <lgl>
1 1 0 1 1 1 TRUE
2 0 0 NA NA NA NA
3 1 0 NA 1 1 TRUE
4 0 0 NA NA NA NA
5 0 1 NA 0 1 TRUE
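Per https://stackoverflow.com/a/44411169/10276092, you can use %in% instead of == to sort-of ignore NAs. Comparing with == propagates missing values (NA == 1 is NA), so a row with no 1 but at least one NA ends up as NA rather than FALSE, whereas NA %in% 1 is FALSE.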
df %>% mutate(flag = ifelse(if_any(.cols=x1:x4, .fns= ~ . %in% 1), 1, 0))
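To see the difference in isolation, here is a small base R sketch using the values from row 2 of the example. The variable names eq_result and in_result are just illustrative, not part of the question.

```r
# Values of x1..x4 in row 2 of the example data
row2 <- c(0, 0, NA, NA)

# == keeps the NAs, so any() cannot decide and returns NA
eq_result <- row2 == 1    # FALSE FALSE NA NA
any(eq_result)            # NA

# %in% never returns NA, so any() gives a definite answer
in_result <- row2 %in% 1  # FALSE FALSE FALSE FALSE
any(in_result)            # FALSE
```

Rows that do contain a 1 are unaffected, since any() returns TRUE as soon as one element is TRUE, even when other elements are NA.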
Here is one way we could do it:
library(dplyr)
library(tidyr)
df %>%
rowwise %>%
mutate(flag = any(cur_data() == 1),
flag = replace_na(flag, 0))
x1 x2 x3 x4 flag
<dbl> <dbl> <dbl> <dbl> <lgl>
1 1 0 1 1 TRUE
2 0 0 NA NA FALSE
3 1 0 NA 1 TRUE
4 0 0 NA NA FALSE
5 0 1 NA 0 TRUE
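As a possible variation on the same row-wise idea, any() has an na.rm argument that drops the NAs before testing, which would make the replace_na() step unnecessary. This is only a sketch, assuming dplyr 1.0.0 or later for c_across(); it also converts the result to integer so the flag is 1/0 as asked in the question.

```r
library(dplyr)

# Same data as in the question, built with data.frame() to keep this self-contained
df <- data.frame(
  x1 = c(1, 0, 1, 0, 0),
  x2 = c(0, 0, 0, 0, 1),
  x3 = c(1, NA, NA, NA, NA),
  x4 = c(1, NA, 1, NA, 0)
)

res <- df %>%
  rowwise() %>%
  # na.rm = TRUE removes NAs first, so rows without a 1 give FALSE, not NA
  mutate(flag = as.integer(any(c_across(x1:x4) == 1, na.rm = TRUE))) %>%
  ungroup()

res$flag
```

Note that rowwise() can be slow on large data frames, so the if_any() approach with %in% is likely the better fit there.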