I am interested in profiling some Rcpp code under OS X (Mountain Lion 10.8.2), but the profiler crashes when being run.
Toy example, using inline
, just designed to take enough time for a profiler to notice.
library(Rcpp) library(inline) src.cpp <- " RNGScope scope; int n = as<int>(n_); double x = 0.0; for ( int i = 0; i < n; i++ ) x += (unif_rand()-.5); return wrap(x);" src.c <- " int i, n = INTEGER(n_)[0]; double x = 0.0; GetRNGstate(); for ( i = 0; i < n; i++ ) x += (unif_rand()-.5); PutRNGstate(); return ScalarReal(x);" f.cpp <- cxxfunction(signature(n_="integer"), src.cpp, plugin="Rcpp") f.c <- cfunction(signature(n_="integer"), src.c)
If I use either the GUI Instruments (in Xcode, version 4.5 (4523)) or the command line sample
, both crash: Instruments crashes straight away, while sample completes processing samples before crashing:
# (in R) set.seed(1) f.cpp(200000000L) # (in a separate terminal window) ~ » sample R # this invokes the profiler Sampling process 81337 for 10 seconds with 1 millisecond of run time between samples Sampling completed, processing symbols... [1] 81654 segmentation fault sample 81337
If I do the same process but with the C version (i.e., f.c(200000000L)
) both Instruments and sample
work fine, and produce output like
Call graph: 1832 Thread_6890779 DispatchQueue_1: com.apple.main-thread (serial) 1832 start (in R) + 52 [0x100000e74] 1832 main (in R) + 27 [0x100000eeb] 1832 run_Rmainloop (in libR.dylib) + 80 [0x1000e4020] 1832 R_ReplConsole (in libR.dylib) + 161 [0x1000e3b11] 1832 Rf_ReplIteration (in libR.dylib) + 514 [0x1000e3822] 1832 Rf_eval (in libR.dylib) + 1010 [0x1000aa402] 1832 Rf_applyClosure (in libR.dylib) + 849 [0x1000af5d1] 1832 Rf_eval (in libR.dylib) + 1672 [0x1000aa698] 1832 do_dotcall (in libR.dylib) + 16315 [0x10007af3b] 1382 file1412f6e212474 (in file1412f6e212474.so) + 53 [0x1007fded5] file1412f6e212474.cpp:16 + 862 unif_rand (in libR.dylib) + 1127,1099,... [0x10000b057,0x10000b03b,...] + 520 fixup (in libR.dylib) + 39,67,... [0x10000aab7,0x10000aad3,...] 356 file1412f6e212474 (in file1412f6e212474.so) + 70,61,... [0x1007fdee6,0x1007fdedd,...] file1412f6e212474.cpp:16 56 unif_rand (in libR.dylib) + 1133 [0x10000b05d] 38 DYLD-STUB$$unif_rand (in file1412f6e212474.so) + 0 [0x1007fdf1c]
I would really appreciate some advice into if there is anything I'm doing wrong, if there is some other preferred way, or if this is just not possible. Given that one of the main uses of Rcpp seems to be in speeding up R code, I'm surprised not to find more information on this, but perhaps I'm looking in the wrong place.
This is on OS X 10.8.2 with R 2.15.1 (x86_64-apple-darwin9.8.0), Rcpp 0.9.15, and g++ --version reports "i686-apple-darwin11-llvm-g++-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)".
Thanks to Dirk's answer below, and his talk here http://dirk.eddelbuettel.com/papers/ismNov2009introHPCwithR.pdf, I have at least a partial solution using Google perftools. First, install from here http://code.google.com/p/gperftools/, and add -lprofiler to PKG_LIBS when compiling the C++ code. Then either
(a) Run R as CPUPROFILE=samples.log R
, run all code and quit (or use Rscript)
(b) Use two small utility functions to turn on/off profiling:
RcppExport SEXP start_profiler(SEXP str) { ProfilerStart(as<const char*>(str)); return R_NilValue; } RcppExport SEXP stop_profiler() { ProfilerStop(); return R_NilValue; }
Then, within R you can do
.Call("start_profiler", "samples.log") # code that calls C++ code to be profiled .Call("stop_profiler")
either way, the file samples.log
will contain profiling information. This can be looked at with
pprof --text /Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R samples.log
which produces output like
Using local file /Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R. Using local file samples.log. Removing __sigtramp from all stack traces. Total: 112 samples 64 57.1% 57.1% 64 57.1% _unif_rand 30 26.8% 83.9% 30 26.8% _process_system_Renviron 14 12.5% 96.4% 101 90.2% _for_profile 3 2.7% 99.1% 3 2.7% Rcpp::internal::expr_eval_methods 1 0.9% 100.0% 1 0.9% _Rf_PrintValueRec 0 0.0% 100.0% 1 0.9% 0x0000000102bbc1ff 0 0.0% 100.0% 15 13.4% 0x00007fff5fbfe06f 0 0.0% 100.0% 1 0.9% _Rf_InitFunctionHashing 0 0.0% 100.0% 1 0.9% _Rf_PrintValueEnv 0 0.0% 100.0% 112 100.0% _Rf_ReplIteration
which would probably be more informative on a real example.
I'm confused, your example is incomplete:
you don't show the (trivial) invocation of cfunction()
and cxxfunction()
you don't show how you invoke the profiler
you aren't profiling the C or C++ code (!!)
Can you maybe edit the question and make it clearer?
Also, when I run this, the two example do give identical speed results as they are essentially identical. [ Rcpp would let you do this in call with sugars random number functions. ]
R> library(Rcpp) R> library(inline) R> R> src.cpp <- " + RNGScope scope; + int n = as<int>(n_); + double x = 0.0; + for ( int i = 0; i < n; i++ ) + x += (unif_rand()-.5); + return wrap(x);" R> R> src.c <- " + int i, n = INTEGER(n_)[0]; + double x = 0.0; + GetRNGstate(); + for ( i = 0; i < n; i++ ) + x += (unif_rand()-.5); + PutRNGstate(); + return Rf_ScalarReal(x);" R> R> fc <- cfunction(signature(n_="int"), body=src.c) R> fcpp <- cxxfunction(signature(n_="int"), body=src.c, plugin="Rcpp") R> R> library(rbenchmark) R> R> print(benchmark(fc(10000L), fcpp(10000L))) test replications elapsed relative user.self sys.self user.child sys.child 1 fc(10000) 100 0.013 1 0.012 0 0 0 2 fcpp(10000) 100 0.013 1 0.012 0 0 0 R>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With