I'm optimizing R code with Rcpp and parallel computing on Windows. I'm having trouble calling an Rcpp function from parLapply. An example follows.
Rcpp code (test.cpp)
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector payoff( double strike, NumericVector data) {
return pmax(data - strike, 0);
}
R code
library(parallel)
library(Rcpp)
sourceCpp("test.cpp")
strike_list <- as.list(seq(10, 100, by = 5))
data <- runif(10000) * 50
# One core version
strike_payoff <- lapply(strike_list, payoff, data)
# Multiple cores version
numWorkers <- detectCores()
cl <- makeCluster(numWorkers, type = "PSOCK")
clusterExport(cl = cl,varlist = "payoff")
strike_payoff <- parLapply(cl, strike_list, payoff, data)
Error for parallel version
Error in checkForRemoteErrors(val) :
8 nodes produced errors; first error: NULL value passed as symbol address
I know this is a Windows issue, since mclapply works well on Linux, but my Linux machine is not as powerful as my Windows one.
Any ideas how to fix it?
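You need to run the sourceCpp() call in each spawned process, or otherwise get your compiled code to them. Right now the main process has the function, but the spawned workers do not: clusterExport() copies only the R wrapper, and the pointer to the compiled routine that it holds is not valid in another process, which is presumably why the error mentions a NULL symbol address.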
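A minimal sketch of the first option, assuming the workers run on the same machine and can read the C++ file. The temporary file, the 2-worker cluster, and the helper name payoff_on_worker are illustrative choices, not part of the original question. The wrapper function matters: passing the main process's payoff directly to parLapply would ship the same stale pointer to the workers again, so the wrapper lets each worker look up its own compiled payoff by name. It is assumed to be defined at top level, so that its enclosing environment is the global environment.

```r
library(parallel)

# Same C++ source as test.cpp in the question, written to a temporary
# file so that this sketch is self-contained.
cpp_file <- tempfile(fileext = ".cpp")
writeLines(c(
  "#include <Rcpp.h>",
  "using namespace Rcpp;",
  "// [[Rcpp::export]]",
  "NumericVector payoff(double strike, NumericVector data) {",
  "  return pmax(data - strike, 0);",
  "}"
), cpp_file)

strike_list <- as.list(seq(10, 100, by = 5))
data <- runif(10000) * 50

cl <- makeCluster(2, type = "PSOCK")  # 2 workers is an arbitrary choice

# Compile and load the C++ code inside every worker.
# No clusterExport() of the main process's payoff is needed.
invisible(clusterCall(cl, function(path) {
  Rcpp::sourceCpp(path)  # defines payoff() in the worker's global environment
  NULL
}, cpp_file))

# Hypothetical helper: resolves payoff by name on the worker at call time.
payoff_on_worker <- function(strike, data) payoff(strike, data)

strike_payoff <- parLapply(cl, strike_list, payoff_on_worker, data)
stopCluster(cl)
```

Each worker compiles the file itself, so startup is slower than in the single-process case; for a long-running job that cost is usually small.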
The easiest way is to build a package and have each worker process load it.
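The package route could look roughly like the sketch below. The package name payoffpkg is hypothetical, and the sketch assumes that a working compiler toolchain (Rtools on Windows) is available and that installing into the default library is acceptable. Rcpp.package.skeleton() and install.packages() are used only as one convenient way to build and install; any other way of building the package works just as well.

```r
library(parallel)

# Same C++ source as test.cpp in the question, written to a temporary file.
cpp_file <- tempfile(fileext = ".cpp")
writeLines(c(
  "#include <Rcpp.h>",
  "using namespace Rcpp;",
  "// [[Rcpp::export]]",
  "NumericVector payoff(double strike, NumericVector data) {",
  "  return pmax(data - strike, 0);",
  "}"
), cpp_file)

# Create a package skeleton around the C++ file ("payoffpkg" is a
# hypothetical name) and install it from source.
pkg_parent <- tempfile("pkgsrc")
dir.create(pkg_parent)
Rcpp::Rcpp.package.skeleton("payoffpkg", path = pkg_parent,
                            cpp_files = cpp_file, example_code = FALSE)
install.packages(file.path(pkg_parent, "payoffpkg"),
                 repos = NULL, type = "source")

strike_list <- as.list(seq(10, 100, by = 5))
data <- runif(10000) * 50

cl <- makeCluster(2, type = "PSOCK")  # 2 workers is an arbitrary choice

# Load the installed package in every worker process.
invisible(clusterEvalQ(cl, library(payoffpkg)))

# A function that lives in a package namespace is resolved inside each
# worker, so it can be passed to parLapply directly.
strike_payoff <- parLapply(cl, strike_list, payoffpkg::payoff, data)
stopCluster(cl)
```

With this approach the code is compiled once at install time rather than once per worker.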