Why is the apply() function slower than a for loop in R?

As a matter of best practice, I'm trying to determine whether it's better to write a function and apply() it across a matrix, or to simply loop over the matrix inside a function. I tried it both ways and was surprised to find that apply() is slower. The task is to take a vector, test whether each element is positive or negative, and return a vector with 1 where the element is positive and -1 where it is negative. The mash() function loops, and the squish() function is passed to apply().

    million <- as.matrix(rnorm(100000))

    mash <- function(x) {
      for (i in 1:NROW(x)) {
        if (x[i] > 0) {
          x[i] <- 1
        } else {
          x[i] <- -1
        }
      }
      return(x)
    }

    squish <- function(x) {
      if (x > 0) {
        return(1)
      } else {
        return(-1)
      }
    }

    ptm <- proc.time()
    loop_million <- mash(million)
    proc.time() - ptm

    ptm <- proc.time()
    apply_million <- apply(million, 1, squish)
    proc.time() - ptm

loop_million results:

       user  system elapsed
      0.468   0.008   0.483

apply_million results:

       user  system elapsed
      1.401   0.021   1.423

What is the advantage of using apply() over a for loop if performance is degraded? Is there a flaw in my test? I compared the two resulting objects for a clue and found:

    > class(apply_million)
    [1] "numeric"
    > class(loop_million)
    [1] "matrix"

This only deepens the mystery. The apply() function cannot accept a plain numeric vector, which is why I cast it with as.matrix() at the beginning, yet it returns a numeric vector. The for loop is fine with a plain numeric vector, and it returns an object of the same class as the one passed to it.

asked Apr 03 '11 23:04

Milktrader

1 Answer

The point of the apply (and plyr) family of functions is not speed, but expressiveness. They also tend to prevent bugs because they eliminate the bookkeeping code needed with loops.
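To make the bookkeeping point concrete, here is a minimal sketch. The names `x`, `sign_loop` and `sign_vapply` are made up for this illustration and do not come from the question. The loop version has to preallocate the result, build an index sequence and assign by position; the `vapply()` version only states the per-element rule.

```r
# Hypothetical example data; any numeric vector would do.
x <- c(-1.5, 0.3, 2.0, -0.1)

# Loop version: preallocation, the index sequence and indexed assignment
# are all bookkeeping the author has to get right.
sign_loop <- function(x) {
  out <- numeric(length(x))        # preallocate the result
  for (i in seq_along(x)) {        # seq_along() also handles length-0 input
    out[i] <- if (x[i] > 0) 1 else -1
  }
  out
}

# vapply version: only the per-element rule is written down.
# FUN.VALUE = numeric(1) asserts that each call returns a single number.
sign_vapply <- function(x) {
  vapply(x, function(v) if (v > 0) 1 else -1, FUN.VALUE = numeric(1))
}

# Both versions should agree element for element.
identical(sign_loop(x), sign_vapply(x))
```

Unlike apply(), vapply() works directly on a plain numeric vector, so no as.matrix() cast is needed in this sketch.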

Lately, answers on Stack Overflow have over-emphasised speed. Your code will get faster on its own as computers get faster and R-core optimises the internals of R. Your code will never get more elegant or easier to understand on its own.

In this case you can have the best of both worlds, an elegant answer using vectorisation that is also very fast: (million > 0) * 2 - 1.
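A minimal sketch of how the vectorised expression can be checked against the element-wise rule from the question. The seed and the names `vec_million` and `ref` are assumptions added for this illustration; timings depend on the machine and R version, so none are quoted.

```r
set.seed(1)                            # assumption: seed added for reproducibility
million <- as.matrix(rnorm(100000))    # same construction as in the question

# The comparison yields a logical matrix; arithmetic coerces TRUE/FALSE
# to 1/0, so positives map to 1 and everything else maps to -1.
vec_million <- (million > 0) * 2 - 1

# Element-wise reference using the same rule as squish() in the question.
ref <- apply(million, 1, function(v) if (v > 0) 1 else -1)

# The vectorised result keeps the matrix shape while apply() returns a
# plain vector, so compare the values as vectors.
all(as.vector(vec_million) == ref)

# Rough timing comparison on your own machine.
system.time((million > 0) * 2 - 1)
system.time(apply(million, 1, function(v) if (v > 0) 1 else -1))
```

Note that the vectorised form returns a matrix, like the loop in the question, because arithmetic on a matrix preserves its dimensions.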

answered Oct 07 '22 07:10

hadley
