I have just started using the Rcpp
package in R, my learning is inspired by the Advanced R course by Hadley Wickham.
Within R studio I have the following .cpp file. The question is more general but this example helps.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector runifC(int n, double min=0, double max=1) {
NumericVector out(n);
for(int i = 0; i < n; ++i) {
out[i] = min + ((double) rand() / (RAND_MAX)) * (max - min);
}
return out;
}
/*** R
library(microbenchmark)
microbenchmark(
'R unif-1' = runif(1),
'C++ unif-1' = runifC(1),
'R unif-100' = runif(100),
'C++ unif-100' = runifC(100),
'R unif-1000' = runif(1000),
'C++ unif-1000' = runifC(1000),
'R unif-100000' = runif(100000),
'C++ unif-100000' = runifC(100000)
)
*/
When I source/save the file it shows me the performance output.
Unit: nanoseconds
expr min lq mean median uq max neval
R unif-1 2061 2644.5 4000.71 3456.0 4297.0 15402 100
C++ unif-1 710 1190.0 1815.11 1685.0 2168.5 5776 100
R unif-100 4717 5566.5 6794.14 6563.0 7435.5 16600 100
C++ unif-100 1450 1997.5 2663.29 2591.5 3107.0 5307 100
R unif-1000 28210 29584.5 31310.54 30380.0 31599.0 52879 100
C++ unif-1000 8292 8951.0 10113.78 9462.5 10121.5 25099 100
R unif-100000 2642581 2975117.0 3104580.62 3030938.5 3119489.0 5435046 100
C++ unif-100000 699833 990924.0 1058855.49 1034430.5 1075078.0 1530351 100
I would expect that runif
would be a very optimised function but the C++ code runs much more efficiently. I might be naive here, but if there is such a difference in performance then why aren't all applicable R functions rewritten in C++?
It seems so obvious that there are a lot of improvements possible that I feel as if I am missing a huge reason of why not all R functions can be blindly copied to C++ for performance.
edit: for this example it has been shown that the C++ implementation of rand()
is slightly flawed. the performance gap that I noticed most used the rand()
function. performance of other functions doesn't seem as drastic so i changed the name of the question.
Please DO NOT USE rand()
. Doing so will kick your package off CRAN too should you submit it.
See eg this C++ reference page for a warning:
Notes
There are no guarantees as to the quality of the random sequence produced. In the past, some implementations of rand() have had serious shortcomings in the randomness, distribution and period of the sequence produced (in one well-known example, the low-order bit simply alternated between 1 and 0 between calls).
If you are interested in alternate random number generators and timing, the Rcpp Gallery.
In general, use the generators provided by R which are of excellent statistical quality, and offered in both scalar and vectorised form ("Rcpp Sugar") by Rcpp.
As of R-3.1.1, runif
uses the .External
interface, which copies its arguments. Luke Tierney changed this to use the .Call
interface in R-devel in revision 66110. The .Call
interface does not copy its arguments. Rcpp uses the .Call
interface.
Your C++ code is still faster under R-devel (using the .Call
interface). This is likely because of differences in the random number generator being used. Also, R's functions will generally have more checks than whatever specialized code you write; and those checks take time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With