Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is Eigens mean() method so much faster than sum()?

Tags:

c++

eigen

This is a rather theoretical question, but I'm quite interested in it and would be glad if someone has some expert knowledge on this which he or she is willing to share.

I have a matrix of floats with 2000 rows and 600 cols and want to subtract the mean of the columns from each row. I have tested the following two lines and compared their runtime:

MatrixXf centered = data.rowwise() - (data.colwise().sum() / data.cols());
MatrixXf centered = data.rowwise() - data.colwise().mean();

I thought, mean() would not do something different from dividing the sum of each column by the number of rows, but while the execution of the first line takes 12.3 seconds on my computer, the second line finishes in 0.09 seconds.

I'm using Eigen version 3.2.6, which currently is the latest version, and my matrices are stored in row-major order.

Does someone know something about the internals of Eigen which could explain this huge performance difference?


Edit: I should add that data in the code above actually is of type Eigen::Map< Eigen::MatrixXf<Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor> > and maps Eigen's functionality to a raw buffer.


Edit 2: As suggested by GuyGreer, I'll provide some sample code to reproduce my findings:

#include <iostream>
#include <chrono>
#include <Eigen/Core>
using namespace std;
using namespace std::chrono;
using namespace Eigen;

int main(int argc, char * argv[])
{
    MatrixXf data(10000, 1000), centered;
    data.setRandom();
    auto start = high_resolution_clock::now();
    if (argc > 1)
        centered = data.rowwise() - data.colwise().mean();
    else
        centered = data.rowwise() - (data.colwise().sum() / data.rows());
    auto stop = high_resolution_clock::now();
    cout << duration_cast<milliseconds>(stop - start).count() << " ms" << endl;
    return 0;
}

Compile with:

g++ -O3 -std=c++11 -o test test.cc

Running the resulting program without arguments, so that is uses sum(), takes 126 seconds on my machine, while running test 1 using mean() only takes 0.03 seconds!


Edit 3: As it turned out (see comments), it is not sum() which takes so long, but the division of the resulting vector by the number of rows. So the new question is: Why does Eigen take more than 2 minutes to divide a vector with 1000 columns by a single scalar?

like image 729
Callidior Avatar asked Oct 27 '15 19:10

Callidior


1 Answers

Somehow, both the partial reduction (sum) and division are recomputed every time because some crucial information about the evaluation cost of the partial reduction are wrongly lost by operator/... Explicitly evaluating the mean fixes the issue:

centered = data.rowwise() - (data.colwise().sum() / data.cols()).eval();

Of course, this evaluation should be done by Eigen for you, as fixed by the changeset 42ab43a. This fix will be part of the next 3.2.7 and 3.3 releases.

like image 56
ggael Avatar answered Oct 07 '22 22:10

ggael