
Why is my C++ implementation slower than the R source?

Tags: c++, r, rcpp

I tried to implement the charToRaw function with Rcpp. C_charToRaw below is copied from the R source.

C++ code:

#include <cstring>  // memcpy
#include <Rcpp.h>
#include <Rinternals.h>

// [[Rcpp::export]]
Rcpp::RawVector Cpp_charToRaw(const std::string& s) {
  Rcpp::RawVector res(s.begin(), s.end());
  return res;
}

// [[Rcpp::export]]
SEXP C_charToRaw(SEXP x) {
  if (!Rf_isString(x) || LENGTH(x) == 0) {
    Rf_error("argument must be a character vector of length 1");
  }
  if (LENGTH(x) > 1) {
    Rf_warning("argument should be a character vector of length 1\nall but the first element will be ignored");
  }
  int nc = LENGTH(STRING_ELT(x, 0));
  SEXP ans = Rf_allocVector(RAWSXP, nc);
  if (nc) {
    memcpy(RAW(ans), CHAR(STRING_ELT(x, 0)), nc);
  }
  return ans;
}

Benchmark code:

x = "Test string. Test string"
bench::mark(
  Cpp_charToRaw(x),
  C_charToRaw(x),
  charToRaw(x),
  iterations = 100000
)

Benchmark results:

# A tibble: 3 x 13
  expression            min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory
  <bch:expr>       <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>
1 Cpp_charToRaw(x)   1.44µs   1.58µs   611480.        0B     30.6 99995     5    163.5ms < [24… <df[,…
2 C_charToRaw(x)     1.38µs   1.49µs   648339.        0B     38.9 99994     6    154.2ms < [24… <df[,…
3 charToRaw(x)     277.88ns 329.81ns  2747742.        0B     27.5 99999     1     36.4ms < [24… <df[,…
# … with 2 more variables: time <list>, gc <list>

Question: Why is the built-in charToRaw so much faster?

Build log:

Generated extern "C" functions 
--------------------------------------------------------


#include <Rcpp.h>
// Cpp_charToRaw
Rcpp::RawVector Cpp_charToRaw(const std::string& s);
RcppExport SEXP sourceCpp_1_Cpp_charToRaw(SEXP sSEXP) {
BEGIN_RCPP
    Rcpp::RObject rcpp_result_gen;
    Rcpp::RNGScope rcpp_rngScope_gen;
    Rcpp::traits::input_parameter< const std::string& >::type s(sSEXP);
    rcpp_result_gen = Rcpp::wrap(Cpp_charToRaw(s));
    return rcpp_result_gen;
END_RCPP
}
// C_charToRaw
SEXP C_charToRaw(SEXP x);
RcppExport SEXP sourceCpp_1_C_charToRaw(SEXP xSEXP) {
BEGIN_RCPP
    Rcpp::RObject rcpp_result_gen;
    Rcpp::RNGScope rcpp_rngScope_gen;
    Rcpp::traits::input_parameter< SEXP >::type x(xSEXP);
    rcpp_result_gen = Rcpp::wrap(C_charToRaw(x));
    return rcpp_result_gen;
END_RCPP
}

Generated R functions 
-------------------------------------------------------

`.sourceCpp_1_DLLInfo` <- dyn.load('/tmp/RtmpIEEIRN/sourceCpp-x86_64-pc-linux-gnu-1.0.2/sourcecpp_11646c07fffb/sourceCpp_5.so')

Cpp_charToRaw <- Rcpp:::sourceCppFunction(function(s) {}, FALSE, `.sourceCpp_1_DLLInfo`, 'sourceCpp_1_Cpp_charToRaw')
C_charToRaw <- Rcpp:::sourceCppFunction(function(x) {}, FALSE, `.sourceCpp_1_DLLInfo`, 'sourceCpp_1_C_charToRaw')

rm(`.sourceCpp_1_DLLInfo`)

Building shared library
--------------------------------------------------------

DIR: /tmp/RtmpIEEIRN/sourceCpp-x86_64-pc-linux-gnu-1.0.2/sourcecpp_11646c07fffb

/usr/lib64/R/bin/R CMD SHLIB -o 'sourceCpp_5.so' --preclean  'test.cpp'  
g++ -I"/usr/include/R/" -DNDEBUG   -I"/home/xxx/R/x86_64-pc-linux-gnu-library/3.6/Rcpp/include" -I"/home/xxx/projects/R/packages/RestRserve/tmp" -I"/home/xxx/projects/R/packages/RestRserve/tmp/../inst/include" -D_FORTIFY_SOURCE=2  -fpic  -march=x86-64 -mtune=generic -O2 -pipe -fno-plt  -c test.cpp -o test.o
g++ -shared -L/usr/lib64/R/lib -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -o sourceCpp_5.so test.o -L/usr/lib64/R/lib -lR

Update

Based on the answer and comments, Rcpp::RNGScope was disabled with [[Rcpp::export(rng = false)]].

I also slightly improved the Cpp_charToRaw function:

// [[Rcpp::export(rng = false)]]
Rcpp::RawVector Cpp_charToRaw2(const char* s) {
  Rcpp::RawVector res(s, s + std::strlen(s));
  return res;
}

Updated benchmarks:

# A tibble: 4 x 13
  expression          min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory
  <bch:expr>        <bch> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>
1 Cpp_charToRaw(x)  448ns  506ns  1789684.        0B     35.8 99998     2     55.9ms < [24… <df[,…
2 Cpp_charToRaw2(x) 361ns  412ns  2180744.        0B     43.6 99998     2     45.9ms < [24… <df[,…
3 C_charToRaw(x)    331ns  369ns  2428416.        0B     24.3 99999     1     41.2ms < [24… <df[,…
4 charToRaw(x)      274ns  311ns  2930855.        0B     58.6 99998     2     34.1ms < [24… <df[,…
# … with 2 more variables: time <list>, gc <list>

Asked Jan 01 '23 17:01 by Artem Klevtsov


1 Answer

The overhead almost certainly comes from the Rcpp wrapper around your functions. As you can see from the generated code, this wrapper sets up an RNG scope, which involves copying a large-ish vector of numbers (in your case this is actually unnecessary; use [[Rcpp::export(rng = false)]] to disable it). In the case of your Cpp_charToRaw, the wrapper additionally needs to copy the R vector into a std::string, since this conversion cannot happen in-place (it could with std::string_view).
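
To make the copy concrete, here is a minimal plain C++17 sketch that does not involve Rcpp at all. The helper names are hypothetical and only serve the illustration: constructing a std::string from a character buffer always yields separate storage, whereas a std::string_view merely points at the existing bytes.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// True when building a std::string from `src` yields separate storage,
// i.e. the characters were copied out of the original buffer.
bool string_owns_a_copy(const char* src) {
    std::string owned(src);
    return owned.data() != src;
}

// True when a std::string_view built from `src` points at the original
// buffer, i.e. no characters were copied.
bool view_shares_buffer(const char* src) {
    std::string_view view(src);
    return view.data() == src;
}

// Small self-check using the string from the question's benchmark.
void check_copy_vs_view() {
    const char* text = "Test string. Test string";
    assert(string_owns_a_copy(text));
    assert(view_shares_buffer(text));
}
```

For a 24-character input the copy itself is cheap, but it is work that the built-in charToRaw, which reads the CHARSXP directly, does not have to do.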

You can test this Rcpp overhead by benchmarking an empty Rcpp function:

// [[Rcpp::export]]
SEXP do_nothing(SEXP x) {
    return x;
}
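
The cost of the RNG scope can be pictured with a plain C++ stand-in. Rcpp::RNGScope is an RAII object that calls R's GetRNGstate() when it is constructed and PutRNGstate() when it is destroyed. The sketch below does not call R; ScopeGuard, g_saved_state, the copy counter, and the state size of 626 are all illustrative assumptions, meant only to show that such a guard does work on every call even when the wrapped function never uses it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a global RNG state; the size is illustrative.
static std::vector<int> g_saved_state(626, 0);
// Counts how many times the state vector has been copied.
static std::size_t g_copies = 0;

// RAII guard: copies the state in on entry and back out on exit.
class ScopeGuard {
public:
    ScopeGuard() : working_(g_saved_state) { ++g_copies; }
    ~ScopeGuard() { g_saved_state = working_; ++g_copies; }
private:
    std::vector<int> working_;
};

// Mirrors a generated wrapper with the guard: two copies per call.
int call_with_guard(int x) {
    ScopeGuard guard;
    return x;
}

// Mirrors a wrapper with the guard disabled: no copies.
int call_without_guard(int x) {
    return x;
}

std::size_t copies_made() { return g_copies; }
```

Calling call_with_guard once raises copies_made() by two, while call_without_guard leaves it unchanged, which is the kind of fixed per-call cost that rng = false removes from the generated wrapper.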

Answered Jan 08 '23 02:01 by Konrad Rudolph