
Creating a R data frame in C/C++

Tags: c++, r

I'm trying to build a data.frame in C/C++ code and pass it into R code. I'm not using Rcpp; I want to use the R C API from first principles, for reasons not relevant here.

I know how to make a vector of doubles and fill it with data (just dummy data here for my example):

SEXP col1;
Rf_protect(col1 = Rf_allocVector(REALSXP, 10));
for (size_t j = 0; j < 10; j++)
{
     double dval = static_cast<double>(j*3.14);
     REAL(col1)[static_cast<int>(j)] = dval;
}
SEXP nameSym = Rf_install(const_cast<char*>("myColumn1"));
Rf_defineVar(nameSym, col1, R_GlobalEnv);

This works, so inside R I can do:

Y <- mean(myColumn1)

But what I really want is a data frame with more than one column:

SEXP col1;
SEXP col2;
Rf_protect(col1 = Rf_allocVector(REALSXP, 10));
Rf_protect(col2 = Rf_allocVector(REALSXP, 10));
for (size_t j = 0; j < 10; j++)
{
     double dval1 = static_cast<double>(j*3.14);
     double dval2 = static_cast<double>(j*42.0);
     REAL(col1)[static_cast<int>(j)] = dval1;
     REAL(col2)[static_cast<int>(j)] = dval2;
}
SEXP nameSym1 = Rf_install(const_cast<char*>("myColumn1"));
SEXP nameSym2 = Rf_install(const_cast<char*>("myColumn2"));
Rf_defineVar(nameSym1, col1, R_GlobalEnv);
Rf_defineVar(nameSym2, col2, R_GlobalEnv);

.... data.frame?

Ideally I'd like to put myColumn1 and myColumn2 into a data.frame ("myData") so I can do:

Y <- mean(myData$myColumn1)
Z <- min(myData$myColumn2)

Does anyone know how to construct a data frame?

Any pointers would be helpful

asked Dec 24 '22 06:12 by MyDeveloperDay


1 Answer

A data frame is a list of equal-length column vectors, so you need to allocate a VECSXP and set its "names", "class" and "row.names" attributes:

ff = inline::cfunction(sig = c(), body = '
    SEXP col1;
    SEXP col2;
    Rf_protect(col1= Rf_allocVector(REALSXP, 10));
    Rf_protect(col2 = Rf_allocVector(REALSXP, 10));
    for (size_t j = 0; j < 10; j++) {
        double dval1 = static_cast<double>(j*3.14);
        double dval2 = static_cast<double>(j*42.0);
        REAL(col1)[static_cast<int>(j)] = dval1;
        REAL(col2)[static_cast<int>(j)] = dval2;
    }
    //SEXP nameSym1 = Rf_install(const_cast<char*>("myColumn1"));
    //SEXP nameSym2 = Rf_install(const_cast<char*>("myColumn2"));
    //Rf_defineVar(nameSym1, col1, R_GlobalEnv);
    //Rf_defineVar(nameSym2, col2, R_GlobalEnv);
    SEXP ans = PROTECT(allocVector(VECSXP, 2)),
         nms = PROTECT(allocVector(STRSXP, 2)),
         rnms = PROTECT(allocVector(INTSXP, 2)); 

    SET_STRING_ELT(nms, 0, mkChar("myColumn1"));
    SET_STRING_ELT(nms, 1, mkChar("myColumn2"));

    SET_VECTOR_ELT(ans, 0, col1);
    SET_VECTOR_ELT(ans, 1, col2);

    // Compact row names: c(NA_integer_, -n) stands for automatic rows 1..n.
    INTEGER(rnms)[0] = NA_INTEGER;
    INTEGER(rnms)[1] = -10;

    setAttrib(ans, R_ClassSymbol, ScalarString(mkChar("data.frame")));
    setAttrib(ans, R_RowNamesSymbol, rnms);
    setAttrib(ans, R_NamesSymbol, nms);

    UNPROTECT(5);
    return(ans);
')

str(ff())
#'data.frame':   10 obs. of  2 variables:
# $ myColumn1: num  0 3.14 6.28 9.42 12.56 ...
# $ myColumn2: num  0 42 84 126 168 210 252 294 336 378
answered Jan 06 '23 21:01 by alexis_laz