
Why is a self-written Rcpp vectorized mathematical function faster than its base counterpart?

Tags: r, rcpp, exp

OK, I know the answer, but, inspired by this question, I'd like to hear some opinions on the following: why is the Rcpp exercise below ca. 15% faster (for long vectors) than the built-in exp()? We all know that Rcpp is a wrapper around the R/C API, so we should expect slightly worse performance.

Rcpp::cppFunction("
   NumericVector exp2(NumericVector x) {
      NumericVector z = Rcpp::clone(x);
      int n = z.size();
      for (int i=0; i<n; ++i)
         z[i] = exp(z[i]);
      return z;
   }
")

library("microbenchmark")
x <- rcauchy(1000000)
microbenchmark(exp(x), exp2(x), unit="relative")
## Unit: relative
##     expr      min       lq   median       uq      max neval
##   exp(x) 1.159893 1.154143 1.155856 1.154482 0.926272   100
##  exp2(x) 1.000000 1.000000 1.000000 1.000000 1.000000   100
asked Oct 17 '14 at 18:10 by gagolews



1 Answer

Base R tends to do more checking for NA, so we can win a little by not doing that. Also note that with tricks like loop unrolling (as done in Rcpp Sugar) we can do a little better still.

So I added

Rcpp::cppFunction("NumericVector expSugar(NumericVector x) { return exp(x); }")

and with that I get a further gain -- with less code on the user side:

R> microbenchmark(exp(x), exp2(x), expSugar(x), unit="relative")
Unit: relative
        expr     min      lq    mean  median      uq     max neval
      exp(x) 1.11190 1.11130 1.11718 1.10799 1.08938 1.02590   100
     exp2(x) 1.08184 1.08937 1.07289 1.07621 1.06382 1.00462   100
 expSugar(x) 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000   100
R> 
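The two effects mentioned in this answer, skipping per-element NaN bookkeeping and unrolling the loop, can be sketched in plain C++ outside of R. This is only an illustration under stated assumptions: `exp_checked`, `exp_plain` and `exp_unrolled` are hypothetical names, the checked variant is a rough stand-in for the kind of NaN handling a general-purpose routine may perform (not a copy of base R's source), and the 4-way unrolled variant is not Rcpp Sugar's actual implementation. Whether the unrolled loop is measurably faster depends on the compiler and optimization flags.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical: exp with per-element NaN bookkeeping. Each result is
// inspected, input NaNs are passed through unchanged, and a flag records
// whether the operation itself produced a NaN.
std::vector<double> exp_checked(const std::vector<double>& x, bool& nan_produced) {
    std::vector<double> y(x.size());
    nan_produced = false;
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = std::exp(x[i]);
        if (std::isnan(y[i])) {
            if (std::isnan(x[i])) {
                y[i] = x[i];          // propagate the input NaN as-is
            } else {
                nan_produced = true;  // NaN created by the computation
            }
        }
    }
    return y;
}

// Hypothetical: the bare loop, comparable to exp2() in the question.
std::vector<double> exp_plain(const std::vector<double>& x) {
    std::vector<double> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = std::exp(x[i]);
    }
    return y;
}

// Hypothetical: the same loop unrolled by hand, four elements per iteration,
// with a scalar tail loop for the remaining 0 to 3 elements.
std::vector<double> exp_unrolled(const std::vector<double>& x) {
    const std::size_t n = x.size();
    std::vector<double> y(n);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        y[i]     = std::exp(x[i]);
        y[i + 1] = std::exp(x[i + 1]);
        y[i + 2] = std::exp(x[i + 2]);
        y[i + 3] = std::exp(x[i + 3]);
    }
    for (; i < n; ++i) {
        y[i] = std::exp(x[i]);
    }
    return y;
}
```

All three functions return identical values for ordinary finite input; they differ only in how much work is done per element besides the call to `std::exp`.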
answered Oct 13 '22 at 21:10 by Dirk Eddelbuettel