
Is ifelse ever appropriate in a non-vectorized situation, and vice versa?

Tags:

r

(Background info: ifelse evaluates both of the expressions, even though only one will be returned. EDIT: This is an incorrect statement. See Tommy's reply)

Is there any example where it makes sense to use ifelse in a non-vectorized situation? I think that "readability" could be a valid answer when we don't care about small efficiency gains, but besides that, is it ever faster/equivalent/better-in-some-other-way to use ifelse when a plain if / else would do the job?

Similarly, if I have a vectorized situation, is ifelse always the best tool to use? It seems strange that both expressions are evaluated. Is it ever faster to loop through the elements one by one and use a plain if / else? I'm guessing that would make sense only if evaluating the expressions took a really long time. Is there any other alternative that would not involve an explicit loop?

Thanks

Xu Wang asked Nov 18 '11 23:11


2 Answers

First, ifelse does NOT always evaluate both expressions - only if there are both TRUE and FALSE elements in the test vector.

ifelse(TRUE, 'foo', stop('bar')) # "foo"
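The other half of that claim can be checked with a small sketch (the variable names all_true and mixed are illustrative, not from the answer): with an all-TRUE test the no argument is never evaluated, while a test containing both TRUE and FALSE forces both arguments.

```r
# All-TRUE test: the 'no' argument is never evaluated, so stop() never runs.
all_true <- ifelse(c(TRUE, TRUE), "foo", stop("bar"))

# Mixed test: 'no' is needed for the FALSE element, so stop() is triggered.
# tryCatch() turns the error into its message so the script keeps running.
mixed <- tryCatch(
  ifelse(c(TRUE, FALSE), "foo", stop("bar")),
  error = function(e) conditionMessage(e)
)

all_true  # "foo" "foo"
mixed     # "bar"
```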

And in my opinion:

ifelse should not be used in a non-vectorized situation. It is always slower and more error prone to use ifelse over if / else:

# This is fairly common if/else code
if (length(letters) > 0) letters else LETTERS

# But this "equivalent" code will yield a very different result - TRY IT!
ifelse(length(letters) > 0, letters, LETTERS)
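For readers who can't run it right away, here is a short sketch of the difference (r_if and r_ifelse are illustrative names): the result of ifelse takes its length from the test, which here has length 1, so only the first element of letters comes back.

```r
# if / else returns the whole chosen object.
r_if <- if (length(letters) > 0) letters else LETTERS

# ifelse returns a result shaped like the test, which has length 1 here.
r_ifelse <- ifelse(length(letters) > 0, letters, LETTERS)

length(r_if)  # 26
r_ifelse      # "a"
```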

In vectorized situations though, ifelse can be a good choice - but beware that the length and attributes of the result might not be what you expect (as above, and I consider ifelse broken in that respect).

Here's an example: tst is of length 5 and has a class. I'd expect the result to be of length 10 and have no class, but that isn't what happens - it gets an incompatible class and length 5!

# a logical vector of class 'mybool'
tst <- structure(1:5 %%2 > 0, class='mybool')

# produces a numeric vector of class 'mybool'!
ifelse(tst, 101:110, 201:210)
#[1] 101 202 103 204 105
#attr(,"class")
#[1] "mybool"

Why would I expect the length to be 10? Because most functions in R recycle the shorter vector to match the longer:

1:5 + 1:10 # returns a vector of length 10.

...But ifelse only recycles the yes/no arguments to match the length of the test argument.

Why would I expect the class (and other attributes) not to be copied from the test object? Because <, which returns a logical vector, does not copy the class and attributes from its (typically numeric) arguments. It doesn't do that because it would typically be very wrong.

1:5 < structure(1:10, class='mynum') # returns a logical vector without class
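One possible workaround for the attribute problem, which is my suggestion and not part of the original answer, is to strip the class from the test before calling ifelse. This is only a sketch: it fixes the unexpected class, but the result still has the length of the test (5), not 10.

```r
# A logical vector of class 'mybool', as in the example above.
tst <- structure(1:5 %% 2 > 0, class = 'mybool')

# unclass() drops the class attribute, so the result carries no class either.
res <- ifelse(unclass(tst), 101:110, 201:210)

res              # 101 202 103 204 105
attributes(res)  # NULL
```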

Finally, can it be more efficient to "do it yourself"? Well, it seems that ifelse is not a primitive like if, and it needs some special code to handle NA. If you don't have NAs, it can be faster to do it yourself.

tst <- 1:1e7 %%2 == 0
a <- rep(1, 1e7)
b <- rep(2, 1e7)
system.time( r1 <- ifelse(tst, a, b) )            # 2.58 sec

# If we know that a and b are of the same length as tst, and that
# tst doesn't have NAs, then we can do like this:
system.time( { r2 <- b; r2[tst] <- a[tst]; r2 } ) # 0.46 secs

identical(r1, r2) # TRUE
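If the test can contain NAs, the shortcut above needs an extra step. The following is a minimal sketch on a tiny input, assuming (as above) that a and b have the same length as tst; it uses which() to drop the NAs from the index and then sets the NA positions explicitly, which mirrors what ifelse returns. I have not timed this variant.

```r
tst <- c(TRUE, NA, FALSE, TRUE)
a <- rep(1, 4)
b <- rep(2, 4)

r1 <- ifelse(tst, a, b)  # 1 NA 2 1

# which() keeps only the positions where tst is TRUE, ignoring NAs.
idx <- which(tst)
r2 <- b
r2[idx] <- a[idx]

# ifelse yields NA wherever the test is NA, so do the same here.
r2[is.na(tst)] <- NA

identical(r1, r2)  # TRUE
```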
Tommy answered Oct 25 '22 13:10


On your second point, how do you define "best"? I think ifelse() is one of the more readable solutions, but may not always be the fastest. Specifically, I've found that writing out boolean conditions and adding them together can give you some performance benefits. Here's a quick example:

> x <- rnorm(1e6)
> system.time(y1 <- ifelse(x > 0,1,2))
   user  system elapsed 
   0.46    0.08    0.53 
> system.time(y2 <- (x > 0) * 1 + (x <= 0) * 2)
   user  system elapsed 
   0.06    0.00    0.06 
> identical(y1, y2)
[1] TRUE

So, if speed is your biggest concern, the boolean approach may be better. For most of my purposes, though, I've found ifelse() quick enough and easy to grok. Your mileage may vary, obviously.
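To make it clearer why the boolean trick works, here is a small sketch on a three-element input (the values are illustrative): multiplying a logical vector by a number coerces TRUE to 1 and FALSE to 0, so each term contributes its value only where its condition holds. It relies on the two conditions being mutually exclusive and covering every element.

```r
x <- c(-1.5, 0, 2.3)

(x > 0) * 1   # 0 0 1
(x <= 0) * 2  # 2 2 0

# Exactly one condition is TRUE per element, so the sum picks one value.
y_bool <- (x > 0) * 1 + (x <= 0) * 2
y_ifelse <- ifelse(x > 0, 1, 2)

y_bool                       # 2 2 1
identical(y_bool, y_ifelse)  # TRUE
```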

Chase answered Oct 25 '22 12:10