R: apply vs do.call

Tags: r, apply, do.call

I just read the profile of @David Arenburg and found a bunch of useful tips on developing good R programming skills and habits. One especially struck me. I have always thought that the apply functions in R were the cornerstone of working with data frames, but he writes:

If you are working with data.frames, forget there is a function called apply- whatever you do - don't use it. Especially with a margin of 1 (the only good usecase for this function is to operate over matrix columns- margin of 2).

Some good alternatives: ?do.call, ?pmax/pmin, ?max.col, ?rowSums/rowMeans/etc, the awesome matrixStats packages (for matrices), ?rowsum and many more

Could anybody explain this to me? Why are apply functions frowned upon?

asked Jun 06 '18 09:06

Helen

2 Answers

apply(DF, 1, f) converts each row of DF to a vector and then passes that vector to f. If DF contains a mix of strings and numbers, each row is converted to a character vector before it is passed to f. So, for example, apply(iris, 1, function(x) sum(x[-5])) will not work even though the row iris[i, -5] contains only numeric elements: the row has already been converted to a character vector, and you can't sum character strings. On the other hand, apply(iris[-5], 1, sum) will work and gives the same result as rowSums(iris[-5]).
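The coercion described above can be checked directly. This is a minimal sketch using only base R and the built-in iris data; the variable names (row_classes, res, row_totals) are made up for illustration.

```r
# Every row of a mixed-type data frame reaches f as a character vector.
row_classes <- apply(iris, 1, class)
unique(row_classes)   # "character"

# Summing inside apply() therefore fails; capture the error message
# instead of stopping the script.
res <- tryCatch(
  apply(iris, 1, function(x) sum(x[-5])),
  error = function(e) conditionMessage(e)
)

# Dropping the non-numeric column first keeps the rows numeric.
row_totals <- apply(iris[-5], 1, sum)
all.equal(unname(row_totals), unname(rowSums(iris[-5])))
```

Here res holds the error message rather than row sums, while the last line compares the apply result with rowSums on the numeric columns only.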
If f produces a vector, the result is a matrix and not another data frame; also, the result is the transpose of what you might expect. This
```
apply(BOD, 1, identity)
```
gives the following rather than giving BOD back:
```
       [,1] [,2] [,3] [,4] [,5] [,6]
Time    1.0  2.0    3    4  5.0  7.0
demand  8.3 10.3   19   16 15.6 19.8
```
Many years ago Hadley Wickham posted iapply, which is idempotent in the sense that iapply(mat, 1, identity) returns mat rather than t(mat), where mat is a matrix. More recently, with his plyr package, one can write:
```
library(plyr)
ddply(BOD, 1, identity)
```
and get BOD back as a data frame.

On the other hand, apply(BOD, 1, sum) gives the same result as rowSums(BOD), and apply(BOD, 1, f) can be useful when f produces a scalar and has no built-in row-wise counterpart of the kind that sum has in rowSums. Also, if f produces a vector and you don't mind a matrix result, you can transpose the output of apply yourself; it is ugly, but it works.
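The transpose workaround might look like the sketch below. It uses only base R and the built-in BOD data; the names m, restored and row_range are hypothetical, and diff(range(x)) is just an arbitrary example of a scalar-valued f with no dedicated base R row function.

```r
# apply() over rows returns one column per original row.
m <- apply(BOD, 1, identity)
dim(m)   # 2 6

# Transpose back and convert to a data frame to recover the original shape.
restored <- as.data.frame(t(m))
all.equal(restored$Time, BOD$Time)
all.equal(restored$demand, BOD$demand)

# A scalar-valued f: the spread (max minus min) within each row.
row_range <- apply(BOD, 1, function(x) diff(range(x)))
```

Note that the round trip only recovers the values and column names; attributes of the original data frame (BOD carries a "reference" attribute) are not preserved.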

answered Oct 06 '22 01:10

G. Grothendieck

I think what the author means is that you should use pre-built, vectorized functions where you can (because it is easier) and avoid apply (because in principle it is a for loop and takes longer):

```
library(microbenchmark)

d <- data.frame(a = rnorm(10, 10, 1),
                b = rnorm(10, 200, 1))

# bad - loop
microbenchmark(apply(d, 1, function(x) if (x[1] < x[2]) x[1] else x[2]))

# good - vectorized but same result
microbenchmark(pmin(d[[1]], d[[2]])) # use double brackets!

# edited:
# -------
# bad: lapply
microbenchmark(data.frame(lapply(d, round, 1)))

# good: do.call faster than lapply
microbenchmark(do.call("round", list(d, digits = 1)))

# --------------
# Unit: microseconds
#                                  expr     min    lq     mean  median      uq     max neval
# do.call("round", list(d, digits = 1)) 104.422 107.1 148.3419 134.767 184.524 332.009   100
#                            expr     min       lq     mean  median      uq      max neval
# data.frame(lapply(d, round, 1)) 235.619 243.2055 298.5042 252.353 276.004 1550.265   100
#
#                                  expr    min      lq    mean median       uq     max neval
# do.call("round", list(d, digits = 1)) 96.389 97.5055 113.075 98.175 105.5375 730.954   100
#                            expr     min       lq     mean  median      uq      max neval
# data.frame(lapply(d, round, 1)) 235.619 243.2055 298.5042 252.353 276.004 1550.265   100
```
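Before comparing timings, it may be worth confirming that the fast and slow variants really return the same values. The sketch below does that with base R only; the seed and the variable names are assumptions added for reproducibility and are not part of the benchmark above.

```r
set.seed(1)  # hypothetical seed, only so the sketch is reproducible
d <- data.frame(a = rnorm(10, 10, 1),
                b = rnorm(10, 200, 1))

# Row-wise minimum: explicit apply() versus vectorized pmin().
row_min_apply <- apply(d, 1, function(x) if (x[1] < x[2]) x[1] else x[2])
row_min_pmin  <- pmin(d[[1]], d[[2]])
same_min <- isTRUE(all.equal(unname(row_min_apply), row_min_pmin))

# Rounding: column-wise lapply() versus one call on the whole data frame.
rounded_lapply <- data.frame(lapply(d, round, 1))
rounded_docall <- do.call("round", list(d, digits = 1))
rounded_direct <- round(d, digits = 1)   # what the do.call() evaluates to
same_round <- isTRUE(all.equal(rounded_lapply, rounded_docall)) &&
  isTRUE(all.equal(rounded_docall, rounded_direct))
```

The do.call("round", ...) form is equivalent to calling round(d, digits = 1) directly, so any speed-up most likely comes from round handling the whole data frame in one call rather than from do.call itself.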

answered Oct 06 '22 02:10

r.user.05apr
