
Fast function to add vector elements by their names

I wrote this R function that, given any number of vectors (...), combines them by summing the respective element values based on their names.

add_vectors <- function(...) {
  a <- list(...)
  nms <- sort(unique(unlist(lapply(a, names))))
  out <- numeric(length(nms))
  names(out) <- nms
  for (v in a) out[names(v)] <- out[names(v)] + v

  out
}

Example:

v1 <- c(a=2,b=3,e=4)
v2 <- c(b=1,c=6,d=0,a=4)
add_vectors(v1, v2)
#
a b c d e 
6 4 6 0 4

I'm trying to write an equivalent function that is much faster.

Unfortunately, at the moment I have no idea how to achieve this in R, so I thought of using Rcpp. But to port this function to Rcpp, I am missing some concepts:

  1. How to handle the ... parameter. With a parameter of type List in Rcpp?
  2. How to iterate over the vectors in the ... parameter.
  3. How to access (and then sum) the elements of the vectors by their names (this is trivial in R, but I cannot figure out how to do it in Rcpp).

So I'm looking for someone who can help me improve the performance of this function (in R, in Rcpp, or both).

Any help is appreciated, thanks.

leodido asked Apr 02 '13 13:04


1 Answer

I would use something like this:

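#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector add_all(List vectors){
    // running total for each name seen so far
    RCPP_UNORDERED_MAP<std::string,double> out ;
    int n = vectors.size() ;
    for( int i=0; i<n; i++){
        // each list element is expected to be a named numeric vector
        NumericVector x = vectors[i] ;
        CharacterVector names = x.attr("names") ;
        int m = x.size() ;

        for( int j=0; j<m; j++){
            String name = names[j] ;
            // a name not seen before starts at 0, so += also covers new names
            out[ name ] += x[j] ;
        }
    }
    // convert the map back to a named numeric vector
    return wrap(out) ;
}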

with the following wrapper:

add_vectors_cpp <- function(...){
    add_all( list(...) )
}

RCPP_UNORDERED_MAP is just a macro that expands to unordered_map, either in std:: or in std::tr1::, depending on your compiler.
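
To show what the hash map is doing independently of Rcpp, here is a minimal sketch in standard C++ of the same accumulate-by-name idea. The NamedVector alias and the add_by_name function are hypothetical names introduced only for this illustration; they are not part of Rcpp. An unordered map makes no promise about iteration order, so this sketch copies the totals into a std::map at the end to get the names back in sorted order, as the R function in the question does.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// A "named vector" modelled as (name, value) pairs.
// Assumption for illustration only; this is not an Rcpp type.
using NamedVector = std::vector<std::pair<std::string, double>>;

// Sum values by name across all input vectors.
// Accumulation uses a hash map; the totals are then copied into a
// std::map so that iterating over the result visits names in sorted order.
std::map<std::string, double> add_by_name(const std::vector<NamedVector>& vectors) {
    std::unordered_map<std::string, double> acc;
    for (const NamedVector& v : vectors) {
        for (const auto& entry : v) {
            // operator[] creates a missing key with value 0.0 before adding
            acc[entry.first] += entry.second;
        }
    }
    return std::map<std::string, double>(acc.begin(), acc.end());
}
```

Calling add_by_name with the two vectors from the question should give the same totals as add_vectors(v1, v2). If sorted names are not needed, the final copy into std::map can be dropped.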

The trick here is to create a regular list out of the ... using the classic list(...).

If you really wanted to pass ... directly down to C++ and deal with it internally, you would have to use the .External interface. This is very rarely used, so Rcpp attributes don't support the .External interface.

With .External, it would look like this (untested):

SEXP add_vectors(SEXP args){
    RCPP_UNORDERED_MAP<std::string,double> out ;
    // args is a pairlist; drop its first cell, which is not one of the vectors
    args = CDR(args) ;
    while( args != R_NilValue ){
        // CAR gives the current argument
        NumericVector x = CAR(args) ;

        CharacterVector names = x.attr("names") ;
        int m = x.size() ;

        for( int j=0; j<m; j++){
            String name = names[j] ;
            out[ name ] += x[j] ;
        }
        // CDR moves on to the rest of the arguments
        args = CDR(args) ;
    }
    return wrap(out) ;
}
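
For readers unfamiliar with the CAR/CDR idiom, the loop above is an ordinary linked-list walk. The following is a toy model in standard C++, not R's actual API: the Cell struct and the sum_args function are hypothetical, with nullptr standing in for R_NilValue. It is only meant to illustrate the traversal pattern, under the assumption that the list always starts with one leading cell to skip.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for a pairlist cell: `car` holds one argument and `cdr`
// points to the rest of the list (nullptr plays the role of R_NilValue).
// Hypothetical type for illustration; it is not part of R or Rcpp.
struct Cell {
    std::vector<std::pair<std::string, double>> car;
    const Cell* cdr;
};

// Walk the list the same way the .External sketch does: skip the leading
// cell, then consume one argument per iteration until the list ends.
// Assumes `args` is non-null, i.e. the leading cell is always present.
std::unordered_map<std::string, double> sum_args(const Cell* args) {
    std::unordered_map<std::string, double> out;
    args = args->cdr;                     // drop the leading cell
    while (args != nullptr) {
        for (const auto& entry : args->car) {
            out[entry.first] += entry.second;
        }
        args = args->cdr;                 // advance to the next argument
    }
    return out;
}
```

The shape matches the .External version line for line: one step past the head, then process the current element and advance until the end marker is reached.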
Romain Francois answered Sep 28 '22 14:09