I am writing some code in C to be called dynamically from R.
This code generates a path of a random Poisson process up to a desired time T. So at every call to my C function, the length of the returned vector will be different depending on the random numbers generated.
What R data structure will I have to create? a LISTSXP? another one?
How can I create it, and how can I append to it? And especially how can I return it back to R?
Thanks for help.
If you're open to switching from C to C++, you get added layer of Rcpp for free. Here is an example from the page of the (still rather incomple) RcppExample package:
#include <RcppClassic.h>
#include <cmath>
RcppExport SEXP newRcppVectorExample(SEXP vector) {
BEGIN_RCPP
Rcpp::NumericVector orig(vector); // keep a copy
Rcpp::NumericVector vec(orig.size()); // create vector same size
// we could query size via
// int n = vec.size();
// and loop over the vector, but using the STL is so much nicer
// so we use a STL transform() algorithm on each element
std::transform(orig.begin(), orig.end(), vec.begin(), ::sqrt);
return Rcpp::List::create(Rcpp::Named( "result" ) = vec,
Rcpp::Named( "original" ) = orig) ;
END_RCPP
}
As you see, no explicit memory allocation, freeing, PROTECT/UNPROTECT
etc, and you get a first class R list object back.
There are lots more examples, included in other SO questions such as this one.
Edit: And you didn't really say what you paths would do, but as a simple illustration, here is C++ code using the Rcpp additions cumsum()
and rpois()
which behave just like they do in R:
R> library(inline)
R>
R> fun <- cxxfunction(signature(ns="integer", lambdas="numeric"),
+ plugin="Rcpp",
+ body='
+ int n = Rcpp::as<int>(ns);
+ double lambda = Rcpp::as<double>(lambdas);
+
+ Rcpp::RNGScope tmp; // make sure RNG behaves
+
+ Rcpp::NumericVector vec = cumsum( rpois( n, lambda ) );
+
+ return vec;
+ ')
R> set.seed(42)
R> fun(3, 0.3)
[1] 1 2 2
R> fun(4, 0.4)
[1] 1 1 1 2
And as a proof, back in R, if we set the seed, we can generate exactly the same numbers:
R> set.seed(42)
R> cumsum(rpois(3, 0.3))
[1] 1 2 2
R> cumsum(rpois(4, 0.4))
[1] 1 1 1 2
R>
It's really up to you what you want to use as temporary structure, because in the end you'll have to allocate a vector for result anyway. So whatever you'll be using is not what you'll return. There are several possibilities:
Calloc
+ Realloc
+ Free
which allows you to expand the temporary memory as needed. Once you have the full set, you allocate the result vector and return it.SETLENGTH
before you return it. There are issues with this, though, because the result will remain over-allocated until duplicated later on.Each of them has drawbacks and benefits, so it really depends on your application to pick the one best suited for you.
Edit: added an example of using the pairlists approach. I would still recommend the Realloc
approach since it's much easier, but nonetheless:
#define BLOCK_SIZE xxx /* some reasonable size for increments - could be adaptive, too */
SEXP block; /* last vector block */
SEXP root = PROTECT(list1(block = allocVector(REALSXP, BLOCK_SIZE)));
SEXP tail = root;
double *values = REAL(block);
int count = 0, total = 0;
do { /* your code to generate values - if you want to add one
first try to add it to the existing block, otherwise allocate new one */
if (count == BLOCK_SIZE) { /* add a new block when needed */
tail = SETCDR(tail, list1(block = allocVector(REALSXP, BLOCK_SIZE)));
values = REAL(block);
total += count;
count = 0;
}
values[count++] = next_value;
} while (...);
total += count;
/* when done, we need to create the result vector */
{
SEXP res = allocVector(REALSXP, total);
double *res_values = REAL(res);
while (root != R_NilValue) {
int size = (CDR(root) == R_NilValue) ? count : BLOCK_SIZE;
memcpy(res_values, REAL(CAR(root)), sizeof(double) * size);
res_values += size;
root = CDR(root);
}
UNPROTECT(1);
return res;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With