Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Rcpp select/subset NumericMatrix column by a NumericVector

Tags:

c++

r

rcpp

I can select all the rows of a matrix and a range of columns of a matrix as follows:

library(Rcpp)
cppFunction('
NumericMatrix subset(NumericMatrix x){
  return x(_, Range(0, 1));
}
')

However, I would like to select columns based on a NumericVector y which, for instance, could be something like c(0, 1, 0, 0, 1). I tried this:

library(Rcpp)
cppFunction('
NumericMatrix subset(NumericMatrix x, NumericVector y){
  return x(_, y);
}
')

but it doesn't compile. How do I do it?

like image 653
Euler_Salter Avatar asked Jan 25 '23 00:01

Euler_Salter


1 Answers

Alas, Rcpp doesn't have great support for non-contiguous views or selecting in a single statement only columns 1 and 4. As you saw, selecting contiguous views or selecting all columns is accessible with Rcpp::Range(). You'll likely want to upgrade to RcppArmadillo for better control over matrix subsets.

RcppArmadillo subset examples

#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
arma::mat matrix_subset_idx(const arma::mat& x,
                            const arma::uvec& y) { 

    // y must be an integer between 0 and columns - 1
    // Allows for repeated draws from same columns.
    return x.cols( y );
}


// [[Rcpp::export]]
arma::mat matrix_subset_logical(const arma::mat& x,
                                const arma::vec& y) { 
    // Assumes that y is 0/1 coded.
    // find() retrieves the integer index when y is equivalent 1. 
    return x.cols( arma::find(y == 1) );
}

Test

# Sample data
x = matrix(1:15, ncol = 5)
x
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    4    7   10   13
# [2,]    2    5    8   11   14
# [3,]    3    6    9   12   15

# Subset only when 1 (TRUE) is found:
matrix_subset_logical(x, c(0, 1, 0, 0, 1))
#      [,1] [,2]
# [1,]    4   13
# [2,]    5   14
# [3,]    6   15

# Subset with an index representing the location
# Note: C++ indices start at 0 not 1!
matrix_subset_idx(x, c(1, 3))
#      [,1] [,2]
# [1,]    4   13
# [2,]    5   14
# [3,]    6   15

Pure Rcpp logic

If you do not want to take on the dependency of armadillo, then the equivalent for the matrix subset in Rcpp is:

#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericMatrix matrix_subset_idx_rcpp(
        Rcpp::NumericMatrix x, Rcpp::IntegerVector y) { 

    // Determine the number of observations
    int n_cols_out = y.size();

    // Create an output matrix
    Rcpp::NumericMatrix out = Rcpp::no_init(x.nrow(), n_cols_out);

    // Loop through each column and copy the data. 
    for(unsigned int z = 0; z < n_cols_out; ++z) {
        out(Rcpp::_, z) = x(Rcpp::_, y[z]);
    }

    return out;
}
like image 196
coatless Avatar answered Feb 13 '23 04:02

coatless