dplyr if_else() vs base R ifelse()

Tags: r, if-statement, dplyr

I am fairly proficient with the Tidyverse, but I have always used ifelse() instead of dplyr's if_else(). I want to change that habit: default to dplyr::if_else() everywhere and phase ifelse() out of my code.

Is there any reason not to do this? Is it likely to get me into trouble? I'll spare you the details, but ifelse() recently tripped me up when I unknowingly created a column of character matrices in my data analysis. By always using if_else(), I hope to avoid this kind of issue in the future.

asked Jun 01 '18 14:06

stackinator

4 Answers

if_else() is stricter. It checks that both alternatives have the same type and throws an error otherwise, whereas ifelse() promotes types as needed. The strictness can be a benefit, but it may break existing scripts unless you handle the errors or convert types explicitly. For example:

library(dplyr)

ifelse(c(TRUE, TRUE, FALSE), "a", 3)
#> [1] "a" "a" "3"
if_else(c(TRUE, TRUE, FALSE), "a", 3)
#> Error: `false` must be type character, not double
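If you do want mixed inputs, one way to satisfy the type check is to convert a branch yourself. This is only a minimal sketch: the names `cond`, `err` and `res` are made up for illustration, and the exact wording of the error differs between dplyr versions. As far as I know, dplyr 1.1.0 and later accept compatible types such as integer and double, but character and double still do not mix.

```r
library(dplyr)

cond <- c(TRUE, TRUE, FALSE)

# Capture the error instead of stopping the script.
err <- tryCatch(if_else(cond, "a", 3), error = function(e) e)
inherits(err, "error")

# Convert the numeric branch explicitly so both branches are character.
res <- if_else(cond, "a", as.character(3))
res
```

The conversion is now visible in the code, instead of happening silently as it does with ifelse().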

answered Oct 04 '22 14:10

James

Another reason to choose if_else() over ifelse() is that ifelse() strips the Date class and returns the underlying numbers:

Dates <- as.Date(c('2018-10-01', '2018-10-02', '2018-10-03'))
new_Dates <- ifelse(Dates == '2018-10-02', Dates + 1, Dates)
str(new_Dates)

#>  num [1:3] 17805 17807 17807
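For comparison, here is a small sketch of the same operation with if_else(), plus a base R workaround that restores the class afterwards. The names `target`, `kept` and `restored` are only for illustration.

```r
library(dplyr)

Dates <- as.Date(c("2018-10-01", "2018-10-02", "2018-10-03"))
target <- as.Date("2018-10-02")

# if_else() keeps the Date class of its branches.
kept <- if_else(Dates == target, Dates + 1, Dates)
class(kept)

# Base R alternative: let ifelse() drop the class, then restore it by hand.
restored <- as.Date(ifelse(Dates == target, Dates + 1, Dates),
                    origin = "1970-01-01")
class(restored)
```

Both approaches give a Date vector; if_else() simply saves you from remembering the conversion.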

if_else() is also generally faster than ifelse(), although the size of the difference depends on the data and on the dplyr version.
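If speed matters to you, it is easy to check on your own machine. The following is a rough sketch using base R's system.time() on an arbitrary random vector. The vector size and the names `x`, `a`, `b`, `t_base` and `t_dplyr` are my own choices, and a single timing run is only indicative.

```r
library(dplyr)

set.seed(1)
x <- runif(1e6)

# Time each function once on the same input; results vary by machine.
t_base  <- system.time(a <- ifelse(x > 0.5, 1, 0))
t_dplyr <- system.time(b <- if_else(x > 0.5, 1, 0))

# Elapsed seconds for each call.
c(ifelse = t_base[["elapsed"]], if_else = t_dplyr[["elapsed"]])

# Both calls should produce the same values.
all.equal(a, b)
```

For anything more careful than a quick look, repeat the timings several times rather than relying on one run.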

Note that when testing multiple conditions, case_when() makes the code more readable and less error-prone.

library(dplyr)

case_when(
  Dates == '2018-10-01' ~ Dates - 1,
  Dates == '2018-10-02' ~ Dates + 1,
  Dates == '2018-10-03' ~ Dates + 2,
  TRUE ~ Dates
)

#> [1] "2018-09-30" "2018-10-03" "2018-10-05"

Created on 2018-06-01 by the reprex package (v0.2.0).

answered Oct 04 '22 16:10

Tung

I'd also add that if_else() can assign a value where the condition is NA, which is a handy way of adding an extra condition.

library(dplyr)

df <- tibble(val = c(80, 90, NA, 110))  # tibble() replaces the deprecated data_frame()
df %>% mutate(category = if_else(val < 100, 1, 2, missing = 9))

#     val category
#   <dbl>    <dbl>
# 1    80        1
# 2    90        1
# 3    NA        9
# 4   110        2
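To make the difference from base R explicit, here is a short side-by-side sketch on the same values. The names `base_res` and `dplyr_res` are just illustrative.

```r
library(dplyr)

val <- c(80, 90, NA, 110)

# ifelse() propagates the NA in the condition.
base_res <- ifelse(val < 100, 1, 2)
base_res

# if_else() substitutes the `missing` value wherever the condition is NA.
dplyr_res <- if_else(val < 100, 1, 2, missing = 9)
dplyr_res
```

With ifelse() you would need a separate is.na() step to get the same result.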

answered Oct 04 '22 16:10

Joe

Another important reason for preferring if_else() to ifelse() is checking for consistency in lengths. See this dangerous gotcha:

> tibble(x = 1:3, y = ifelse(TRUE, x, 4:6))
# A tibble: 3 x 2
      x     y
  <int> <int>
1     1     1
2     2     1
3     3     1

Compare with

> tibble(x = 1:3, y = if_else(TRUE, x, 4:6))
    Error: `true` must be length 1 (length of `condition`), not 3.

The intention in both cases is clearly for column y to equal either x or 4:6, according to the value of a single (scalar) logical variable. ifelse() silently truncates its output to length 1, which is then silently recycled; if_else() catches what is almost certainly an error at its source.
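If the condition really is a single logical value, a plain `if` / `else` expression does what was intended. This is only a sketch, assuming a hypothetical flag called `use_x`; the second variant shows how to keep if_else() by stretching the flag to the length of the data.

```r
library(dplyr)

use_x <- TRUE
x <- 1:3

# Scalar condition: choose one whole vector or the other.
y <- if (use_x) x else 4:6
tibble(x = x, y = y)

# Vectorised variant: repeat the flag so all three arguments have matching lengths.
y_vec <- if_else(rep(use_x, length(x)), x, 4:6)
y_vec
```

Either way, y ends up with the same length as x instead of a recycled single value.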

answered Oct 04 '22 16:10

ChrisW
