Running compiled C++ code with Rcpp

Tags: c++, r, rcpp

I have been working my way through Dirk Eddelbuettel's Rcpp tutorial here:

http://www.rinfinance.com/agenda/

I have learned how to save a C++ file in a directory and then call and run it from within R. The C++ file I am running is called 'logabs2.cpp' and its contents are directly from one of Dirk's slides:

#include <Rcpp.h>

using namespace Rcpp;

inline double f(double x) { return ::log(::fabs(x)); }

// [[Rcpp::export]]
std::vector<double> logabs2(std::vector<double> x) {
    std::transform(x.begin(), x.end(), x.begin(), f);
    return x;
}

I run it with this R code:

library(Rcpp)
sourceCpp("c:/users/mmiller21/simple r programs/logabs2.cpp")
logabs2(seq(-5, 5, by=2))
# [1] 1.609438 1.098612 0.000000 0.000000 1.098612 1.609438
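
For readers who want to check those values without R, here is a minimal sketch of the same transform in plain C++. It uses nothing from Rcpp; the name logabs2_std is hypothetical and does not come from Dirk's slides.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Same helper as in the Rcpp version: log of the absolute value.
inline double f(double x) { return std::log(std::fabs(x)); }

// Hypothetical Rcpp-free counterpart of logabs2(): takes the vector by value,
// transforms that copy in place, and returns it.
std::vector<double> logabs2_std(std::vector<double> x) {
    std::transform(x.begin(), x.end(), x.begin(), f);
    return x;
}
```

Called with {-5, -3, -1, 1, 3, 5}, it should reproduce the six values printed above, since log(1) = 0 and the result does not depend on the sign of x.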

I am running the code on a Windows 7 machine from within the R GUI that seems to install by default. I also installed the most recent version of Rtools. The above R code seems to take a relatively long time to run. I suspect most of that time is devoted to compiling the C++ code and that once the C++ code is compiled it runs very quickly. Microbenchmark certainly suggests that Rcpp reduces computation time.

I have never used C++ until now, but I know that when I compile C code I get an *.exe file. I have searched my hard drive for a file called logabs2.exe but cannot find one. I am wondering whether the above C++ code might run even faster if a logabs2.exe file were created. Is it possible to create a logabs2.exe file, store it in a folder somewhere, and then have Rcpp call that file whenever I want to use it? I do not know whether that makes sense. If I could store a C++ function in an *.exe file, then perhaps I would not have to compile the function every time I wanted to use it with Rcpp, and then perhaps the Rcpp code would be even faster.

Sorry if this question does not make sense or is a duplicate. If it is possible to store the C++ function as an *.exe file I am hoping someone will show me how to modify my R code above to run it. Thank you for any help with this or for setting me straight on why what I suggest is not possible or recommended.

I look forward to seeing Dirk's new book.

asked Jun 17 '13 13:06 by Mark Miller



2 Answers

Thank you to user1981275, Dirk Eddelbuettel and Romain Francois for their responses. Below is how I compiled a C++ file and created a *.dll, then called and used that *.dll file inside R.

Step 1. I created a new folder called 'c:\users\mmiller21\myrpackages' and pasted the file 'logabs2.cpp' into that new folder. The file 'logabs2.cpp' was created as described in my original post.

Step 2. Inside the new folder I created a new R package called 'logabs2' using an R file I wrote called 'new package creation.r'. The contents of 'new package creation.r' are:

setwd('c:/users/mmiller21/myrpackages/')

library(Rcpp)

Rcpp.package.skeleton("logabs2", example_code = FALSE, cpp_files = c("logabs2.cpp"))

I found the above syntax for Rcpp.package.skeleton on one of Hadley Wickham's websites: https://github.com/hadley/devtools/wiki/Rcpp

Step 3. I installed the new R package "logabs2" in R using the following line in the DOS command window:

C:\Program Files\R\R-3.0.1\bin\x64>R CMD INSTALL -l c:\users\mmiller21\documents\r\win-library\3.0\ c:\users\mmiller21\myrpackages\logabs2

where:

the location of the rcmd.exe file is:

C:\Program Files\R\R-3.0.1\bin\x64>

the location of installed R packages on my computer is:

c:\users\mmiller21\documents\r\win-library\3.0\

and the location of my new R package prior to being installed is:

c:\users\mmiller21\myrpackages\

Syntax used in the DOS command window was found by trial and error and may not be ideal. At some point I pasted a copy of 'logabs2.cpp' in 'C:\Program Files\R\R-3.0.1\bin\x64>' but I do not think that mattered.

Step 4. After installing the new R package I ran it using an R file I named 'new package usage.r' in the 'c:/users/mmiller21/myrpackages/' folder (although I do not think the folder was important). The contents of 'new package usage.r' are:

library(logabs2)
logabs2(seq(-5, 5, by=2))

The output was:

# [1] 1.609438 1.098612 0.000000 0.000000 1.098612 1.609438

This file loaded the package Rcpp without me asking.

In this case base R was faster, assuming I did this correctly.

#> microbenchmark(logabs2(seq(-5, 5, by=2)), times = 100)
#Unit: microseconds
#                        expr    min     lq  median     uq     max neval
# logabs2(seq(-5, 5, by = 2)) 43.086 44.453 50.6075 69.756 190.803   100

#> microbenchmark(log(abs(seq(-5, 5, by=2))), times=100)
#Unit: microseconds
#                         expr    min     lq median    uq     max neval
# log(abs(seq(-5, 5, by = 2))) 38.298 38.982 39.666 40.35 173.023   100

However, using the *.dll file was faster than compiling the C++ source on the fly, which took almost six seconds of elapsed time here:

system.time(
  cppFunction("
NumericVector logabs(NumericVector x) {
    return log(abs(x));
}
")
)

#   user  system elapsed 
#   0.06    0.08    5.85 

Although base R seems faster or as fast as the *.dll file in this case, I have no doubt that using the *.dll file with Rcpp will be faster than base R in most cases.

This was my first attempt creating an R package or using Rcpp and no doubt I did not use the most efficient methods. Also, I apologize for any typographic errors in this post.

EDIT

In a comment below I think Romain Francois suggested I modify the *.cpp file to the following:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector logabs(NumericVector x) {
    return log(abs(x));
}

and recreate my R package, which I have now done. I then compared base R against my new package using the following code:

library(logabs)

logabs(seq(-5, 5, by=2))
log(abs(seq(-5, 5, by=2)))

library(microbenchmark)

microbenchmark(logabs(seq(-5, 5, by=2)), log(abs(seq(-5, 5, by=2))), times = 100000)

Base R is still a tiny bit faster or no different:

Unit: microseconds
                         expr    min     lq median     uq       max neval
   logabs(seq(-5, 5, by = 2)) 42.401 45.137 46.505 69.073 39754.598 1e+05
 log(abs(seq(-5, 5, by = 2))) 37.614 40.350 41.718 62.234  3422.133 1e+05

Perhaps this is because base R is already vectorized. I suspect with more complex functions base R will be much slower. Or perhaps I am still not using the most efficient approach, or perhaps I simply made an error somewhere.
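
One kind of function where compiled code tends to help is a loop in which each result depends on the previous one, since that is awkward to express with R's vectorized arithmetic. As a minimal sketch (a hypothetical example, not taken from this thread), here is an exponential moving average in plain C++; the names ema and alpha are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: exponential moving average with smoothing factor alpha.
// Each output element depends on the previous output, so the work is
// inherently sequential.
std::vector<double> ema(const std::vector<double>& x, double alpha) {
    std::vector<double> y(x.size());
    if (x.empty()) return y;          // nothing to smooth
    y[0] = x[0];                      // seed with the first observation
    for (std::size_t i = 1; i < x.size(); ++i) {
        y[i] = alpha * x[i] + (1.0 - alpha) * y[i - 1];
    }
    return y;
}
```

With Rcpp, the same body could be placed under a // [[Rcpp::export]] line in the same way as logabs2 in the question, which already passes std::vector<double> to and from R.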

answered Sep 20 '22 12:09 by Mark Miller


You say

I have never used C++ until now, but I know that when I compile C code I get an *.exe file

and that is true if and only if you build an executable. Here, we build dynamically loadable libraries, and those tend to have different extensions depending on the operating system: .dll for Windoze, .so for Linux, .dylib for OS X.

So nothing wrong here, you simply had the wrong assumption.
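
To make the distinction concrete: an executable needs a main() entry point, while a shared library only exports functions for a host process (here R) to call. A minimal sketch, assuming a hypothetical function logabs_raw written for R's .C() interface, which passes every argument as a pointer and expects a void return:

```cpp
#include <cmath>

// Hypothetical library entry point: note that there is no main().
// extern "C" gives the symbol an unmangled name so a host process can look it
// up by name after loading the library.
extern "C" void logabs_raw(double* x, int* n) {
    // Overwrite each element with log(|x|), in place.
    for (int i = 0; i < *n; ++i) {
        x[i] = std::log(std::fabs(x[i]));
    }
}
```

Built with R CMD SHLIB, a file like this yields a .dll or .so rather than an .exe. As far as I know, sourceCpp() performs an equivalent build behind the scenes, which is why no logabs2.exe ever appears.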

answered Sep 21 '22 12:09 by Dirk Eddelbuettel