Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cohabitation of RcppArmadillo and RcppParallel

Tags:

rcpp

The following toy example for parallelFor works fine (f2 is the parallelized version of f1):

// [[Rcpp::depends(RcppParallel)]]
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
#include <RcppParallel.h>
#include <iostream>
#define vector NumericVector

using namespace Rcpp;
using namespace RcppParallel;


// compute values i/i+1 for i = 0 to n-1
// [[Rcpp::export]]
vector f1(int n) {
  vector x(n);
  for(int i = 0; i < n; i++) x[i] = (double) i/ (i+1);
  return x;
}

struct mytry : public Worker {
  vector output;

  mytry(vector out) : output(out) {}

  void operator()(std::size_t begin, std::size_t end) {
    for(int i = begin; i < end; i++) output[i] = (double) i/ (i+1);
  }

};

// [[Rcpp::export]]
vector f2(int n) {
  vector x(n);
  mytry A(x);
  parallelFor(0, n, A);
  return x;
}

However, if I replace #define vector NumericVector by #define vector arma::vec this doesn’t work any more. The codes compiles and run, f1 is ok, but the vector returned by f2 just contains uninitialized values.

Many thanks in advance for any clarification.

like image 664
Elvis Avatar asked Oct 07 '14 10:10

Elvis


1 Answers

The problem here -- your class should be taking the vector by reference, rather than by value.

This is because, when using RcppParallel, you typically pre-allocate memory for an object somewhere, then fill that object -- so the parallel workers should be taking a reference to that object you want to fill.

So your worker should look like (as you noted):

struct mytry : public Worker {
  vector& output;

  mytry(vector& out) : output(out) {}

  void operator()(std::size_t begin, std::size_t end) {
    for(int i = begin; i < end; i++) output[i] = (double) i/ (i+1);
  }

Note that this works (perhaps surprisingly) for Rcpp vectors because they already are just 'proxy' objects -- just objects encapsulating a pointer to data. When you pass an Rcpp vector by value, you copy the pointer (not the underlying data!) plus some extra vector bits (e.g. the length of the vector) -- so the 'copy' retains a reference to the same data structure.

When you use a more 'classical' vector, e.g. the arma::vec or std::vector, when passing that by value to the worker you really are copying a whole new vector to the class, then filling that (temporary, copied) vector -- so the original vector never actually gets filled.

like image 182
Kevin Ushey Avatar answered Oct 06 '22 02:10

Kevin Ushey