Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Profiling Rcpp code on OS X

Tags:

r

profiling

rcpp

I am interested in profiling some Rcpp code under OS X (Mountain Lion 10.8.2), but the profiler crashes when being run.

Toy example, using inline, just designed to take enough time for a profiler to notice.

library(Rcpp) library(inline)  src.cpp <- "   RNGScope scope;   int n = as<int>(n_);   double x = 0.0;   for ( int i = 0; i < n; i++ )     x += (unif_rand()-.5);   return wrap(x);"  src.c <- "   int i, n = INTEGER(n_)[0];   double x = 0.0;   GetRNGstate();   for ( i = 0; i < n; i++ )     x += (unif_rand()-.5);   PutRNGstate();   return ScalarReal(x);"  f.cpp <- cxxfunction(signature(n_="integer"), src.cpp, plugin="Rcpp") f.c <- cfunction(signature(n_="integer"), src.c) 

If I use either the GUI Instruments (in Xcode, version 4.5 (4523)) or the command line sample, both crash: Instruments crashes straight away, while sample completes processing samples before crashing:

# (in R) set.seed(1) f.cpp(200000000L)  # (in a separate terminal window) ~ » sample R  # this invokes the profiler Sampling process 81337 for 10 seconds with 1 millisecond of run time between samples Sampling completed, processing symbols... [1]    81654 segmentation fault  sample 81337 

If I do the same process but with the C version (i.e., f.c(200000000L)) both Instruments and sample work fine, and produce output like

Call graph: 1832 Thread_6890779   DispatchQueue_1: com.apple.main-thread  (serial)   1832 start  (in R) + 52  [0x100000e74]     1832 main  (in R) + 27  [0x100000eeb]       1832 run_Rmainloop  (in libR.dylib) + 80  [0x1000e4020]         1832 R_ReplConsole  (in libR.dylib) + 161  [0x1000e3b11]           1832 Rf_ReplIteration  (in libR.dylib) + 514  [0x1000e3822]             1832 Rf_eval  (in libR.dylib) + 1010  [0x1000aa402]               1832 Rf_applyClosure  (in libR.dylib) + 849  [0x1000af5d1]                 1832 Rf_eval  (in libR.dylib) + 1672  [0x1000aa698]                   1832 do_dotcall  (in libR.dylib) + 16315  [0x10007af3b]                     1382 file1412f6e212474  (in file1412f6e212474.so) + 53  [0x1007fded5]  file1412f6e212474.cpp:16                     + 862 unif_rand  (in libR.dylib) + 1127,1099,...  [0x10000b057,0x10000b03b,...]                     + 520 fixup  (in libR.dylib) + 39,67,...  [0x10000aab7,0x10000aad3,...]                     356 file1412f6e212474  (in file1412f6e212474.so) + 70,61,...  [0x1007fdee6,0x1007fdedd,...]  file1412f6e212474.cpp:16                     56 unif_rand  (in libR.dylib) + 1133  [0x10000b05d]                     38 DYLD-STUB$$unif_rand  (in file1412f6e212474.so) + 0  [0x1007fdf1c] 

I would really appreciate some advice into if there is anything I'm doing wrong, if there is some other preferred way, or if this is just not possible. Given that one of the main uses of Rcpp seems to be in speeding up R code, I'm surprised not to find more information on this, but perhaps I'm looking in the wrong place.

This is on OS X 10.8.2 with R 2.15.1 (x86_64-apple-darwin9.8.0), Rcpp 0.9.15, and g++ --version reports "i686-apple-darwin11-llvm-g++-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)".

A solution

Thanks to Dirk's answer below, and his talk here http://dirk.eddelbuettel.com/papers/ismNov2009introHPCwithR.pdf, I have at least a partial solution using Google perftools. First, install from here http://code.google.com/p/gperftools/, and add -lprofiler to PKG_LIBS when compiling the C++ code. Then either

(a) Run R as CPUPROFILE=samples.log R, run all code and quit (or use Rscript)

(b) Use two small utility functions to turn on/off profiling:

RcppExport SEXP start_profiler(SEXP str) {   ProfilerStart(as<const char*>(str));   return R_NilValue; }  RcppExport SEXP stop_profiler() {   ProfilerStop();   return R_NilValue; } 

Then, within R you can do

.Call("start_profiler", "samples.log") # code that calls C++ code to be profiled .Call("stop_profiler") 

either way, the file samples.log will contain profiling information. This can be looked at with

pprof --text /Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R samples.log 

which produces output like

Using local file /Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R. Using local file samples.log. Removing __sigtramp from all stack traces. Total: 112 samples   64  57.1%  57.1%       64  57.1% _unif_rand   30  26.8%  83.9%       30  26.8% _process_system_Renviron   14  12.5%  96.4%      101  90.2% _for_profile    3   2.7%  99.1%        3   2.7% Rcpp::internal::expr_eval_methods    1   0.9% 100.0%        1   0.9% _Rf_PrintValueRec    0   0.0% 100.0%        1   0.9% 0x0000000102bbc1ff    0   0.0% 100.0%       15  13.4% 0x00007fff5fbfe06f    0   0.0% 100.0%        1   0.9% _Rf_InitFunctionHashing    0   0.0% 100.0%        1   0.9% _Rf_PrintValueEnv    0   0.0% 100.0%      112 100.0% _Rf_ReplIteration 

which would probably be more informative on a real example.

like image 756
Rich Avatar asked Nov 05 '12 00:11

Rich


1 Answers

I'm confused, your example is incomplete:

  • you don't show the (trivial) invocation of cfunction() and cxxfunction()

  • you don't show how you invoke the profiler

  • you aren't profiling the C or C++ code (!!)

Can you maybe edit the question and make it clearer?

Also, when I run this, the two example do give identical speed results as they are essentially identical. [ Rcpp would let you do this in call with sugars random number functions. ]

R> library(Rcpp) R> library(inline) R>  R> src.cpp <- " +   RNGScope scope; +   int n = as<int>(n_); +   double x = 0.0; +   for ( int i = 0; i < n; i++ ) +     x += (unif_rand()-.5); +   return wrap(x);" R>  R> src.c <- " +   int i, n = INTEGER(n_)[0]; +   double x = 0.0; +   GetRNGstate(); +   for ( i = 0; i < n; i++ ) +     x += (unif_rand()-.5); +   PutRNGstate(); +   return Rf_ScalarReal(x);" R>  R> fc   <- cfunction(signature(n_="int"), body=src.c) R> fcpp <- cxxfunction(signature(n_="int"), body=src.c, plugin="Rcpp") R>  R> library(rbenchmark) R>  R> print(benchmark(fc(10000L), fcpp(10000L)))          test replications elapsed relative user.self sys.self user.child sys.child 1   fc(10000)          100   0.013        1     0.012        0          0         0 2 fcpp(10000)          100   0.013        1     0.012        0          0         0 R>  
like image 132
Dirk Eddelbuettel Avatar answered Oct 09 '22 07:10

Dirk Eddelbuettel