When I was re-reading Hadley's Advanced R recently, I noticed that he said in Chapter 6 that `if`
can be used as a function like
`if`(i == 1, print("yes"), print("no"))
(If you have the physical book in hand, it's on Page 80)
We know that ifelse
is slow (Does ifelse really calculate both of its vectors every time? Is it slow?) as it evaluates all arguments. Will `if`
be a good alternative to that as if
seems to only evaluate TRUE
arguments (this is just my assumption)?
Update: Based on the answers from @Benjamin and @Roman and the comments from @Gregor and many others, ifelse
seems to be a better solution for vectorized calculations. I'm taking @Benjamin's answer here as it provides a more comprehensive comparison and for the community wellness. However, both answers(and the comments) are worth reading.
Switch is generally faster than a long list of ifs because the compiler can generate a jump table. The longer the list, the better a switch statement is over a series of if statements. Copy link CC BY-SA 2.5. Follow this answer to receive notifications.
The MySQL CASE statement is faster in comparison to PHP if statement. The PHP if statement takes too much time because it loads data and then process while CASE statement does not.
In general, "else if" style can be faster because in the series of ifs, every condition is checked one after the other; in an "else if" chain, once one condition is matched, the rest are bypassed.
We know that ifelse is slow (Does ifelse really calculate both of its vectors every time? Is it slow?) as it evaluates all arguments.
This is more of an extended comment building on Roman's answer, but I need the code utilities to expound:
Roman is correct that if
is faster than ifelse
, but I am under the impression that the speed boost of if
isn't particularly interesting since it isn't something that can easily be harnessed through vectorization. That is to say, if
is only advantageous over ifelse
when the cond
/test
argument is of length 1.
Consider the following function which is an admittedly weak attempt at vectorizing if
without having the side effect of evaluating both the yes
and no
conditions as ifelse
does.
ifelse2 <- function(test, yes, no){
result <- rep(NA, length(test))
for (i in seq_along(test)){
result[i] <- `if`(test[i], yes[i], no[i])
}
result
}
ifelse2a <- function(test, yes, no){
sapply(seq_along(test),
function(i) `if`(test[i], yes[i], no[i]))
}
ifelse3 <- function(test, yes, no){
result <- rep(NA, length(test))
logic <- test
result[logic] <- yes[logic]
result[!logic] <- no[!logic]
result
}
set.seed(pi)
x <- rnorm(1000)
library(microbenchmark)
microbenchmark(
standard = ifelse(x < 0, x^2, x),
modified = ifelse2(x < 0, x^2, x),
modified_apply = ifelse2a(x < 0, x^2, x),
third = ifelse3(x < 0, x^2, x),
fourth = c(x, x^2)[1L + ( x < 0 )],
fourth_modified = c(x, x^2)[seq_along(x) + length(x) * (x < 0)]
)
Unit: microseconds
expr min lq mean median uq max neval cld
standard 52.198 56.011 97.54633 58.357 68.7675 1707.291 100 ab
modified 91.787 93.254 131.34023 94.133 98.3850 3601.967 100 b
modified_apply 645.146 653.797 718.20309 661.568 676.0840 3703.138 100 c
third 20.528 22.873 76.29753 25.513 27.4190 3294.350 100 ab
fourth 15.249 16.129 19.10237 16.715 20.9675 43.695 100 a
fourth_modified 19.061 19.941 22.66834 20.528 22.4335 40.468 100 a
SOME EDITS: Thanks to Frank and Richard Scriven for noticing my shortcomings.
As you can see, the process of breaking up the vector to be suitable to pass to if
is a time consuming process and ends up being slower than just running ifelse
(which is probably why no one has bothered to implement my solution).
If you're really desperate for an increase in speed, you can use the ifelse3
approach above. Or better yet, Frank's less obvious* but brilliant solution.
yes
and no
have length 1, otherwise you'll want to stick with ifelse3
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With