Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Returning an R function from rcpp

Is there a way in Rcpp to return an R function with some pre-computed values that are only computed on the first function call? Consider the following R code:

1: func_generator<-function(X) {
2:  X_tot<-sum(X)
3:  function(b_vec) { (X_tot*b_vec) }
4: }
5: myfunc<-func_generator(c(3,4,5))
6: myfunc(1:2)
7: myfunc(5:6)
8: myfunc2<-func_generator(c(10,11,12,13))
...

Can this be programmed in Rcpp? In practice, assume that something more computationally intensive is done in place of line 2.

To add context, given vector X and scalar b, there is some likelihood function f(b|X), which can be reexpressed as f(b,s(X)) for some sufficient statistic s(X) that is a function only of X, and which involves some computation. This is in a computationally intensive computer experiment, with many vectors X (many likelihoods), and many separate calls to f(bvec|X) for each likelihood, so I'd rather compute s(X) once (for each likelihood) and save it in some fashion rather than re-computing it many times. I've started by simply programming f(bvec,X) to evaluate f(b|X) at the points bvec=(b_1,...,b_n), but this has extra overhead since I call this function several times and it computes s(X) on each run. I'd like to just compute s(X) once.

Any suggestions to accomplish this task efficiently in Rcpp would be appreciated (whether via returning a function; or via storing intermediate calculations in some other fashion).

like image 434
Devin Apps Avatar asked Sep 16 '18 01:09

Devin Apps


People also ask

How to call r function in Rcpp?

Using the Function class, you can call R functions from Rcpp. The argument given to the R function is determined based on position and name. Use Named() or _[] to pass a value to an argument by specifying argument name. Name() can be used in two ways: Named("argument_name", value) or Named("argument_name") = value .

How to compile c++ in R?

To compile the C++ code, use sourceCpp("path/to/file. cpp") . This will create the matching R functions and add them to your current session. Note that these functions can not be saved in a .

What is RCPP?

The Regional Conservation Partnership Program (RCPP), established through the 2014 Farm Bill, allows conservation partners with similar missions to collaborate with USDA's Natural Resources Conservation Service (NRCS) to further the conservation, restoration, and sustainable use of soil, water, and wildlife habitat in ...

What is RCPP sugar?

Rcpp sugar brings a higher-level of abstraction to C++ code written using the Rcpp API. Rcpp sugar is based on expression templates (Abrahams and Gurtovoy, 2004; Vandevoorde and Josuttis, 2003) and provides some 'syntactic sugar' facilities directly in Rcpp.


1 Answers

One simple way to store intermediate results would be a static variable at function level:

// [[Rcpp::plugins(cpp11)]]
#include <thread>
#include <chrono>
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector foo(Rcpp::NumericVector X, Rcpp::NumericVector b, bool useCache = true) {
  static double cache;
  static bool initialized{false};
  if (!(useCache && initialized)) {
    // sleep to simulate actual work
    std::this_thread::sleep_for (std::chrono::seconds(1));
    cache = Rcpp::sum(X);
    initialized = true;
  }
  return cache * b;
}

/*** R
X <- 1:10
b <- 10:20

system.time(r1 <- foo(X, b))
system.time(r2 <- foo(X, b))
all.equal(r1, r2)
system.time(r3 <- foo(X, b, FALSE))
all.equal(r1, r3)
*/

Output:

> system.time(r1 <- foo(X, b))
   user  system elapsed 
      0       0       1 

> system.time(r2 <- foo(X, b))
   user  system elapsed 
  0.002   0.000   0.002 

> all.equal(r1, r2)
[1] TRUE

> system.time(r3 <- foo(X, b, FALSE))
   user  system elapsed 
      0       0       1 

> all.equal(r1, r3)
[1] TRUE

When the cache is used in the second function call, the result is computed almost instantaneously.

This approach is efficient if you can loop over the different b within a loop over the different X. If this restriction does not work for you, then you could use the memoise package at the R level to efficiently store the output of your expensive function for arbitrary input:

// [[Rcpp::plugins(cpp11)]]
#include <thread>
#include <chrono>
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector foo(double total, Rcpp::NumericVector b) {
  return total * b;
}

// [[Rcpp::export]]
double bar(Rcpp::NumericVector X) {
  // sleep to simulate actual work
  std::this_thread::sleep_for (std::chrono::seconds(1));
  return Rcpp::sum(X);
}


/*** R
X1 <- 1:10
b1 <- 10:20
X2 <- 10:1
b2 <- 20:10

library(memoise)
bar2 <- memoise(bar)

system.time(r11 <- foo(bar2(X1), b1))
system.time(r21 <- foo(bar2(X2), b2))
system.time(r12 <- foo(bar2(X1), b1))
system.time(r22 <- foo(bar2(X2), b2))
all.equal(r11, r12)
all.equal(r21, r22)
*/

Output:

> system.time(r11 <- foo(bar2(X1), b1))
   user  system elapsed 
  0.001   0.000   1.001 

> system.time(r21 <- foo(bar2(X2), b2))
   user  system elapsed 
  0.033   0.000   1.033 

> system.time(r12 <- foo(bar2(X1), b1))
   user  system elapsed 
      0       0       0 

> system.time(r22 <- foo(bar2(X2), b2))
   user  system elapsed 
      0       0       0 

> all.equal(r11, r12)
[1] TRUE

> all.equal(r21, r22)
[1] TRUE

As an alternative you could also use these two functions as building blocks for your function generator:

func_generator <- function(X) {
  X_tot <- bar(X)
  function(b_vec) { foo(X_tot, b_vec) }
}
myfunc <- func_generator(c(3,4,5))
myfunc2 <- func_generator(c(10,11,12,13))
myfunc(1:2)
myfunc(5:6)
myfunc2(1:2)
myfunc2(5:6)

So keep the numerical expensive work in C++, but keep it simple. The functional aspects can then be added using R.

like image 50
Ralf Stubner Avatar answered Oct 20 '22 10:10

Ralf Stubner