
Eigen vs Matlab: parallelized Matrix-Multiplication

I would like to compare the speed of Matlab in matrix multiplication with the speed of Eigen 3 on an Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz. The code including Eigen:

#include <iostream>
#include "Eigen/Dense"
#include <chrono>
#include <omp.h>


using namespace std;
using namespace Eigen;

const int dim=100;

int main()
{
    std::chrono::time_point<std::chrono::system_clock> start, end;

    int n;
    n = Eigen::nbThreads();
    cout<<n<<"\n";

    Matrix<double, Dynamic, Dynamic> m1(dim,dim);
    Matrix<double, Dynamic, Dynamic> m2(dim,dim);
    Matrix<double, Dynamic, Dynamic> m_res(dim,dim);

    start = std::chrono::system_clock::now();

    for (int i = 0 ; i <100000; ++i) {
        m1.setRandom(dim,dim);
        m2.setRandom(dim,dim);
        m_res=m1*m2;

    }

    end = std::chrono::system_clock::now();
    std::chrono::duration<double> elapsed_seconds = end-start;

    std::cout << "elapsed time: " << elapsed_seconds.count() << "s\n";

    return 0;
}

It is compiled with g++ -O3 -std=c++11 -fopenmp and executed with OMP_NUM_THREADS=8 ./prog. In Matlab I'm using:

function mat_test(N,dim)
%
% N:    how many tests
% dim:  dimension of the matrices

tic
parfor i=1:N
     A = rand(dim);
     B = rand(dim);
     C = A*B;
end
toc

The result is: 9s for Matlab, 36s for Eigen. What am I doing wrong in the Eigen case? I can rule out the dynamic allocation of the matrices as the cause. Also, only 3 threads are used instead of 8.

EDIT:

Maybe I didn't state it clearly enough: the task is to multiply double-valued matrices of dim=100 100000 times, with the matrices randomly refilled each time, not only once. The goal is to do this as fast as possible with Eigen. If Eigen cannot compete with Matlab, what would you suggest instead?

asked Feb 02 '15 19:02 by pawel_winzig



1 Answer

Below is a version of your code that makes fairer use of Eigen. In summary:

  • Move setRandom() outside the benchmarking loop. setRandom() calls the system rand() function, which is rather slow.
  • Use .noalias() to avoid creating a temporary (this only makes sense when the right-hand side is a product).
  • Set OMP_NUM_THREADS to the true number of cores, not the number of hyper-threads (4 in your case).
  • Your CPU supports AVX and FMA, which are only supported by the devel branch of Eigen (it will become 3.3). Use the devel branch and enable them with the -mavx and -mfma compiler options (about a 3.5x speed-up compared to SSE only).
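
The first point in the list above matters for the case described in the question's edit, where the matrices have to be refilled on every iteration. As a minimal sketch, assuming a standard-library generator is an acceptable replacement for rand(): the hypothetical helper fill_uniform() below fills a flat buffer with uniform values between -1 and 1, the range setRandom() uses for floating-point types. The helper name and the choice of std::mt19937_64 are illustrative assumptions and not part of Eigen, and whether this is actually faster than rand() on a given system has to be measured.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not an Eigen function): fill a flat buffer with
// uniform doubles between -1 and 1, drawn from a caller-owned generator.
// Giving each thread its own generator avoids sharing generator state.
void fill_uniform(std::vector<double>& buf, std::mt19937_64& gen)
{
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    for (std::size_t i = 0; i < buf.size(); ++i) {
        buf[i] = dist(gen);
    }
}
```

A buffer filled this way can be viewed as a dim x dim matrix without copying through Eigen::Map<Eigen::MatrixXd>(buf.data(), dim, dim).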

The code:

#include <iostream>
#include "Eigen/Dense"
#include <chrono>

using namespace std;
using namespace Eigen;

const int dim=100;

int main()
{
    std::chrono::time_point<std::chrono::system_clock> start, end;

    int n;
    n = Eigen::nbThreads();
    cout << n << "\n";

    Matrix<double, Dynamic, Dynamic> m1(dim,dim);
    Matrix<double, Dynamic, Dynamic> m2(dim,dim);
    Matrix<double, Dynamic, Dynamic> m_res(dim,dim);

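    // Fill the inputs once, before the timer starts, so that only the
    // products are measured.
    m1.setRandom();
    m2.setRandom();

    start = std::chrono::system_clock::now();

    for (int i = 0 ; i <100000; ++i) {
      // noalias() writes the product directly into m_res, without a temporary.
      m_res.noalias() = m1 * m2;
    }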

    end = std::chrono::system_clock::now();
    std::chrono::duration<double> elapsed_seconds = end-start;

    std::cout << "elapsed time: " << elapsed_seconds.count() << "s\n";

    return 0;
}
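
The timing pattern in this program can be factored out. The sketch below is one possible way to do that and is not part of the original answer: a hypothetical helper time_seconds() runs a callable a given number of times and returns the elapsed wall-clock time. It uses std::chrono::steady_clock, a monotonic clock that is not affected by system time adjustments.

```cpp
#include <chrono>

// Hypothetical helper: call `work` `repeats` times and return the
// elapsed wall-clock time in seconds, measured with a monotonic clock.
template <typename F>
double time_seconds(F&& work, int repeats)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

With the matrices from the program above, the loop could then be timed as time_seconds([&] { m_res.noalias() = m1 * m2; }, 100000), which keeps the setRandom() calls outside the measured region.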
answered Nov 04 '22 10:11 by ggael