Suppose that A is a complex matrix. I am interested in computing the product A%*%Conj(t(A)) in R efficiently. As far as I understand, using C++ would speed up things significantly, so that is what I am trying to do.
I have the following code for real matrices that I can use in R.
library(Rcpp);
library(inline);
library(RcppEigen);
crossprodCpp <- '
using Eigen::Map;
using Eigen::MatrixXd;
using Eigen::Lower;
const Map<MatrixXd> A(as<Map<MatrixXd> >(AA));
const int m(A.rows());
MatrixXd AAt(MatrixXd(m, m).setZero().selfadjointView<Lower>().rankUpdate(A));
return wrap(AAt);
'
fcprd <- cxxfunction(signature(AA = "matrix"), crossprodCpp, "RcppEigen")
A<-matrix(rnorm(100^2),100)
all.equal(fcprd(A),tcrossprod(A))
fcprd(A) runs much faster on my laptop than tcrossprod(A). This is what I get for A<-matrix(rnorm(1000^2),1000):
microbenchmark::microbenchmark('tcrossprod(A)'=tcrossprod(A),'A%*%t(A)'=A%*%t(A),fcprd=fcprd(A))
Unit: milliseconds
expr min lq mean median uq max neval
tcrossprod(A) 428.06452 435.9700 468.9323 448.8168 504.2628 618.7681 100
A%*%t(A) 722.24053 736.6197 775.4814 767.7668 809.8356 903.8592 100
fcprd 95.04678 100.0733 111.5021 103.6616 107.2551 197.4479 100
However, this code only works for matrices with double precision entries. How could I modify this code so that it works for complex matrices?
I have a very limited knowledge of programming, but I am trying to learn. Any help is much appreciated!
The Eigen library supports also complex entries via Eigen::MatrixXcd. So in principle it should work if you replace MatrixXd with MatrixXcd. However, this does not compile probably because there is no as-function for complex matrices using Map (c.f. https://github.com/RcppCore/RcppEigen/blob/master/inst/unitTests/runit.RcppEigen.R#L205). The as-function are needed to convert between R data types and C++/Eigen data types (c.f. http://dirk.eddelbuettel.com/code/rcpp/Rcpp-extending.pdf). If you do not use Map, then you can use this:
crossprodCpp <- '
using Eigen::MatrixXcd;
using Eigen::Lower;
const MatrixXcd A(as<MatrixXcd>(AA));
const int m(A.rows());
MatrixXcd AAt(MatrixXcd(m, m).setZero().selfadjointView<Lower>().rankUpdate(A));
return wrap(AAt);
'
fcprd <- inline::cxxfunction(signature(AA = "matrix"), crossprodCpp, "RcppEigen")
N <- 100
A <- matrix(complex(real = rnorm(N), imaginary = rnorm(N)), N)
all.equal(fcprd(A), A %*% Conj(t(A)))
However, this is slower than the base R version in my tests:
N <- 1000
A <- matrix(complex(real = rnorm(N * N), imaginary = rnorm(N * N)), N)
all.equal(fcprd(A), A %*% Conj(t(A)))
#> [1] TRUE
microbenchmark::microbenchmark(base = A %*% Conj(t(A)), eigen = fcprd(A))
#> Unit: milliseconds
#> expr min lq mean median uq max neval
#> base 111.6512 124.4490 145.7583 140.9199 160.3420 241.8986 100
#> eigen 453.6702 501.5419 535.0192 537.2925 564.8746 628.4999 100
Note that matrix multiplication in R is done via BLAS. However, the default BLAS implementation used by R is not very fast. One way to improve R's performance is to use an optimized BLAS library, c.f. https://csgillespie.github.io/efficientR/set-up.html#blas-and-alternative-r-interpreters.
Alternatively you can use the BLAS function zherk if you have a full BLAS available. Very rough:
dyn.load("/usr/lib/libblas.so")
zherk <- function(a, uplo = 'u', trans = 'n') {
n <- nrow(a)
k <- ncol(a)
c <- matrix(complex(real = 0, imaginary = 0), nrow = n, ncol = n)
z <- .Fortran("zherk",
uplo = as.character(uplo),
trans = as.character(trans),
n = as.integer(n),
k = as.integer(k),
alpha = as.double(1),
a = as.complex(a),
lda = as.integer(n),
beta = as.double(0),
c = as.complex(c),
ldc = as.integer(n))
matrix(z$c, nrow = n, ncol = n)
}
N <- 2
A <- matrix(complex(real = rnorm(N * N), imaginary = rnorm(N * N)), N)
zherk(A, uplo = "l") - A %*% Conj(t(A))
Note that this fills only the upper (or lower) triangular part but is quite fast:
microbenchmark::microbenchmark(base = A %*% Conj(t(A)), blas = zherk(A))
#> Unit: milliseconds
#> expr min lq mean median uq max neval
#> base 112.5588 117.12531 146.10026 138.37565 167.6811 282.3564 100
#> blas 66.9541 70.12438 91.44617 82.74522 108.4979 188.3728 100
Here is a way to bind an Eigen::Map<Eigen::MatrixXcd> object in Rcpp. The solution works in a R package setup, but I'm not sure about an easy way to put it together using the inline library.
First, you need to provide the following specialization in your inst/include/mylib.h such that this header get included in the RcppExports.cpp:
#include <complex>
#include <Eigen/Core>
#include <Eigen/Dense>
#include <Rcpp.h>
namespace Rcpp {
namespace traits {
template<>
class Exporter<Eigen::Map<Eigen::Matrix<std::complex<double>, Eigen::Dynamic, Eigen::Dynamic> > > {
using OUT = typename Eigen::Map<Eigen::Matrix<std::complex<double>, Eigen::Dynamic, Eigen::Dynamic> >;
const static int RTYPE = ::Rcpp::traits::r_sexptype_traits<std::complex<double>>::rtype;
Rcpp::Vector<RTYPE> vec;
int d_ncol, d_nrow;
public:
Exporter(SEXP x)
: vec(x), d_ncol(1)
, d_nrow(Rf_xlength(x)) {
if (TYPEOF(x) != RTYPE)
throw std::invalid_argument("Wrong R type for mapped matrix");
if (::Rf_isMatrix(x)) {
int* dims = INTEGER(::Rf_getAttrib(x, R_DimSymbol));
d_nrow = dims[0];
d_ncol = dims[1];
}
}
OUT get() { return OUT(reinterpret_cast<std::complex<double>*>(vec.begin()), d_nrow, d_ncol); }
};
}}
The only difference with the unspecialized Exporter available in RcppEigenWrap.h being the reinterpret_cast on the last line. Both std::complex and Rcomplex having C99 complex compatible types, they are supposed to have identical memory layouts regardless of the implementation.
Wrapping it up, you can now create your function as:
// [[Rcpp::export]]
Eigen::MatrixXd selfadj_mult(const Eigen::Map<Eigen::MatrixXcd>& mat) {
Eigen::MatrixXd result = (mat * mat.adjoint()).real();
return result;
}
and then invoke the function in R as:
library(mylib)
library(microbenchmark)
N <- 1000
A <- matrix(complex(real = rnorm(N * N), imaginary = rnorm(N * N)), N)
microbenchmark::microbenchmark(
base = A %*% Conj(t(A))
, eigen = mylib::selfadj_mult(A)
, times = 100L
)
the code is compiled on centos7/gcc83 with -O3 -DNDEBUG -flto -march=generic. R has been build from source with the exact same compiler/flags (using the default BLAS binding). Results are:
Unit: seconds
expr min lq mean median uq max neval
base 2.9030320 2.9045865 2.9097162 2.9053835 2.9093232 2.9614318 100
eigen 1.1978697 1.2004888 1.2134219 1.2031046 1.2057647 1.3035751 100
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With