Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

S4 object with a pointer to a C struct

Tags:

c

r

s4

rcpp

I have a third-party C library I am using to write an R extension. I am required to create a few structs defined in the library (and initialize them) I need to maintain them as part of an S4 object (think of these structs as defining to state of a computation, to destroy them would be to destroy all remaining computation and the results of all that has been already computed). I am thinking of creating a S4 object to hold pointers these structs as void* pointers but it is not at all clear how to do so, what would be the type of the slot?

like image 337
pbhowmick Avatar asked Dec 18 '16 16:12

pbhowmick


1 Answers

As pointed out by @hrbrmstr, you can use the externalptr type to keep such objects "alive", which is touched on in this section of Writing R Extensions, although I don't see any reason why you will need to store anything as void*. If you don't have any issue with using a little C++, the Rcpp class XPtr can eliminate a fair amount of the boilerplate involved with managing EXTPTRSXPs. As an example, assume the following simplified example represents your third party library's API:

#include <Rcpp.h>
#include <stdlib.h>

typedef struct {
    unsigned int count;
    double total;
} CStruct;

CStruct* init_CStruct() {
    return (CStruct*)::malloc(sizeof(CStruct));
}

void free_CStruct(CStruct* ptr) {
    ::free(ptr);
    ::printf("free_CStruct called.\n");
}

typedef Rcpp::XPtr<CStruct, Rcpp::PreserveStorage, free_CStruct> xptr_t;

When working with pointers created via new it is generally sufficient to use Rcpp::XPtr<SomeClass>, because the default finalizer simply calls delete on the held object. However, since you are dealing with a C API, we have to supply the (default) template parameter Rcpp::PreserveStorage, and more importantly, the appropriate finalizer (free_CStruct in this example) so that the XPtr does not call delete on memory allocated via malloc, etc., when the corresponding R object is garbage collected.

Continuing with the example, assume you write the following functions to interact with your CStruct:

// [[Rcpp::export]]
xptr_t MakeCStruct() {
    CStruct* ptr = init_CStruct();
    ptr->count = 0;
    ptr->total = 0;

    return xptr_t(ptr, true);
}

// [[Rcpp::export]]
void UpdateCStruct(xptr_t ptr, SEXP x) {
    if (TYPEOF(x) == REALSXP) {
        R_xlen_t i = 0, sz = XLENGTH(x);
        for ( ; i < sz; i++) {
            if (!ISNA(REAL(x)[i])) {
                ptr->count++;
                ptr->total += REAL(x)[i];
            }
        }
        return;
    }

    if (TYPEOF(x) == INTSXP) {
        R_xlen_t i = 0, sz = XLENGTH(x);
        for ( ; i < sz; i++) {
            if (!ISNA(INTEGER(x)[i])) {
                ptr->count++;
                ptr->total += INTEGER(x)[i];
            }
        }
        return;
    }

    Rf_warning("Invalid SEXPTYPE.\n");
}

// [[Rcpp::export]]
void SummarizeCStruct(xptr_t ptr) {
    ::printf(
        "count: %d\ntotal: %f\naverage: %f\n",
        ptr->count, ptr->total,
        ptr->count > 0 ? ptr->total / ptr->count : 0
    );
}

// [[Rcpp::export]]
int GetCStructCount(xptr_t ptr) {
    return ptr->count;
}

// [[Rcpp::export]]
double GetCStructTotal(xptr_t ptr) {
    return ptr->total;
}

// [[Rcpp::export]]
void ResetCStruct(xptr_t ptr) {
    ptr->count = 0;
    ptr->total = 0.0;
}

At this point, you have done enough to start handling CStructs from R:

  • ptr <- MakeCStruct() will initialize a CStruct and store it as an externalptr in R
  • UpdateCStruct(ptr, x) will modify the data stored in the CStruct, SummarizeCStruct(ptr) will print a summary, etc.
  • rm(ptr); gc() will remove the ptr object and force the garbage collector to run, thus calling free_CStruct(ptr) and destroying the object on the C side of things as well

You mentioned the use of S4 classes, which is one option for containing all of these functions in a single place. Here's one possibility:

setClass(
    "CStruct",
    slots = c(
        ptr = "externalptr",
        update = "function",
        summarize = "function",
        get_count = "function",
        get_total = "function",
        reset = "function"
    )
)

setMethod(
    "initialize",
    "CStruct",
    function(.Object) {
        .Object@ptr <- MakeCStruct()
        .Object@update <- function(x) {
            UpdateCStruct(.Object@ptr, x)
        }
        .Object@summarize <- function() {
            SummarizeCStruct(.Object@ptr)
        }
        .Object@get_count <- function() {
            GetCStructCount(.Object@ptr)
        }
        .Object@get_total <- function() {
            GetCStructTotal(.Object@ptr)
        }
        .Object@reset <- function() {
            ResetCStruct(.Object@ptr)
        }
        .Object
    }
) 

Then, we can work with the CStructs like this:

ptr <- new("CStruct")
ptr@summarize()
# count: 0
# total: 0.000000
# average: 0.000000

set.seed(123)
ptr@update(rnorm(100))
ptr@summarize()
# count: 100
# total: 9.040591
# average: 0.090406

ptr@update(rnorm(100))
ptr@summarize()
# count: 200
# total: -1.714089
# average: -0.008570

ptr@reset()
ptr@summarize()
# count: 0
# total: 0.000000
# average: 0.000000

rm(ptr); gc()
# free_CStruct called.
#          used (Mb) gc trigger (Mb) max used (Mb)
# Ncells 484713 25.9     940480 50.3   601634 32.2
# Vcells 934299  7.2    1650153 12.6  1308457 10.0

Of course, another option is to use Rcpp Modules, which more or less take care of the class definition boilerplate on the R side (using reference classes rather than S4 classes, however).

like image 138
nrussell Avatar answered Oct 04 '22 04:10

nrussell