Why does MATLAB/Octave wipe the floor with C++ in Eigenvalue Problems?

Tags:

I'm hoping that the answer to the question in the title is that I'm doing something stupid!

Here is the problem. I want to compute all the eigenvalues and eigenvectors of a real, symmetric matrix. I have implemented code in MATLAB (actually, I run it using Octave), and C++, using the GNU Scientific Library. I am providing my full code below for both implementations.

As far as I can understand, GSL comes with its own implementation of the BLAS API, (hereafter I refer to this as GSLCBLAS) and to use this library I compile using:

g++ -O3 -lgsl -lgslcblas

GSL suggests here to use an alternative BLAS library, such as the self-optimizing ATLAS library, for improved performance. I am running Ubuntu 12.04, and have installed the ATLAS packages from the Ubuntu repository. In this case, I compile using:

g++ -O3 -lgsl -lcblas -latlas -lm

For all three cases, I have performed experiments with randomly-generated matrices of sizes 100 to 1000 in steps of 100. For each size, I perform 10 eigendecompositions with different matrices, and average the time taken. The results are these:

Plot of Results

The difference in performance is ridiculous. For a matrix of size 1000, Octave performs the decomposition in under a second; GSLCBLAS and ATLAS take around 25 seconds.

I suspect that I may be using the ATLAS library incorrectly. Any explanations are welcome; thanks in advance.

Some notes on the code:

In the C++ implementation, there is no need to make the matrix symmetric, because the function only uses the lower triangular part of it.
In Octave, the line triu(A) + triu(A, 1)' enforces the matrix to be symmetric.
If you wish to compile the C++ code your own Linux machine, you also need to add the flag -lrt, because of the clock_gettime function.
Unfortunately I don't think clock_gettime exits on other platforms. Consider changing it to gettimeofday.

Octave Code

K = 10;

fileID = fopen('octave_out.txt','w');

for N = 100:100:1000
    AverageTime = 0.0;

    for k = 1:K
        A = randn(N, N);
        A = triu(A) + triu(A, 1)';
        tic;
        eig(A);
        AverageTime = AverageTime + toc/K;
    end

    disp([num2str(N), " ", num2str(AverageTime), "\n"]);
    fprintf(fileID, '%d %f\n', N, AverageTime);
end

fclose(fileID);

C++ Code

#include <iostream>
#include <fstream>
#include <time.h>

#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_eigen.h>
#include <gsl/gsl_vector.h>
#include <gsl/gsl_matrix.h>

int main()
{
    const int K = 10;

    gsl_rng * RandomNumberGenerator = gsl_rng_alloc(gsl_rng_default);
    gsl_rng_set(RandomNumberGenerator, 0);

    std::ofstream OutputFile("atlas.txt", std::ios::trunc);

    for (int N = 100; N <= 1000; N += 100)
    {
        gsl_matrix* A = gsl_matrix_alloc(N, N);
        gsl_eigen_symmv_workspace* EigendecompositionWorkspace = gsl_eigen_symmv_alloc(N);
        gsl_vector* Eigenvalues = gsl_vector_alloc(N);
        gsl_matrix* Eigenvectors = gsl_matrix_alloc(N, N);

        double AverageTime = 0.0;
        for (int k = 0; k < K; k++)
        {   
            for (int i = 0; i < N; i++)
            {
                for (int j = 0; j < N; j++)
                {
                    gsl_matrix_set(A, i, j, gsl_ran_gaussian(RandomNumberGenerator, 1.0));
                }
            }

            timespec start, end;
            clock_gettime(CLOCK_MONOTONIC_RAW, &start);

            gsl_eigen_symmv(A, Eigenvalues, Eigenvectors, EigendecompositionWorkspace);

            clock_gettime(CLOCK_MONOTONIC_RAW, &end);
            double TimeElapsed = (double) ((1e9*end.tv_sec + end.tv_nsec) - (1e9*start.tv_sec + start.tv_nsec))/1.0e9;
            AverageTime += TimeElapsed/K;
            std::cout << "N = " << N << ", k = " << k << ", Time = " << TimeElapsed << std::endl;
        }
        OutputFile << N << " " << AverageTime << std::endl;

        gsl_matrix_free(A);
        gsl_eigen_symmv_free(EigendecompositionWorkspace);
        gsl_vector_free(Eigenvalues);
        gsl_matrix_free(Eigenvectors);
    }

    return 0;
}

909

asked Aug 02 '13 04:08

MGA

2 Answers

I disagree with the previous post. This is not a threading issue, this is an algorithm issue. The reason matlab, R, and octave wipe the floor with C++ libraries is because their C++ libraries use more complex, better algorithms. If you read the octave page you can find out what they do[1]:

Eigenvalues are computed in a several step process which begins with a Hessenberg decomposition, followed by a Schur decomposition, from which the eigenvalues are apparent. The eigenvectors, when desired, are computed by further manipulations of the Schur decomposition.

Solving eigenvalue/eigenvector problems is non-trivial. In fact its one of the few things "Numerical Recipes in C" recommends you don't implement yourself. (p461). GSL is often slow, which was my initial response. ALGLIB is also slow for its standard implementation (I'm getting about 12 seconds!):

#include <iostream>
#include <iomanip>
#include <ctime>

#include <linalg.h>

using std::cout;
using std::setw;
using std::endl;

const int VERBOSE = false;

int main(int argc, char** argv)
{

    int size = 0;
    if(argc != 2) {
        cout << "Please provide a size of input" << endl;
        return -1;
    } else {
        size = atoi(argv[1]);
        cout << "Array Size: " << size << endl;
    }

    alglib::real_2d_array mat;
    alglib::hqrndstate state;
    alglib::hqrndrandomize(state);
    mat.setlength(size, size);
    for(int rr = 0 ; rr < mat.rows(); rr++) {
        for(int cc = 0 ; cc < mat.cols(); cc++) {
            mat[rr][cc] = mat[cc][rr] = alglib::hqrndnormal(state);
        }
    }

    if(VERBOSE) {
        cout << "Matrix: " << endl;
        for(int rr = 0 ; rr < mat.rows(); rr++) {
            for(int cc = 0 ; cc < mat.cols(); cc++) {
                cout << setw(10) << mat[rr][cc];
            }
            cout << endl;
        }
        cout << endl;
    }

    alglib::real_1d_array d;
    alglib::real_2d_array z;
    auto t = clock();
    alglib::smatrixevd(mat, mat.rows(), 1, 0, d, z);
    t = clock() - t;

    cout << (double)t/CLOCKS_PER_SEC << "s" << endl;

    if(VERBOSE) {
        for(int cc = 0 ; cc < mat.cols(); cc++) {
            cout << "lambda: " << d[cc] << endl;
            cout << "V: ";
            for(int rr = 0 ; rr < mat.rows(); rr++) {
                cout << setw(10) << z[rr][cc];
            }
            cout << endl;
        }
    }
}

If you really need a fast library, probably need to do some real hunting.

[1] http://www.gnu.org/software/octave/doc/interpreter/Basic-Matrix-Functions.html

answered Oct 24 '22 14:10

Micah

I have also encountered with the problem. The real cause is that the eig() in matlab doesn't calculate the eigenvectors, but the C version code above does. The different in time spent can be larger than one order of magnitude as shown in the figure below. So the comparison is not fair.

In Matlab, depending on the return value, the actual function called will be different. To force the calculation of eigenvectors, the [V,D] = eig(A) should be used (see codes below).

The actual time to compute eigenvalue problem depends heavily on the matrix properties and the desired results, such as

Real or complex
Hermitian/Symmetric or not
Dense or sparse
Eigenvalues only, Eigenvectors, Maximum eigenvalue only, etc
Serial or parallel

There are algorithms optimized for each of the above case. In the gsl, these algorithm are picked manually, so a wrong selection will decrease performance significantly. Some C++ wrapper class or some language such as matlab and mathematica will choose the optimized version through some methods.

Also, the Matlab and Mathematica have used parallelization. These are further broaden the gap you see by few times, depending on the machine. It is reasonable to say that the calculation of eigenvalues and eigenvectors of a general complex 1000x1000 are about a second and ten second, without parallelization.

Fig. Compare Matlab and C. The "+ vec" means the codes included the calculations of the eigenvectors. The CPU% is the very rough observation of CPU usage at N=1000 which is upper bounded by 800%, though they are supposed to fully use all 8 cores. The gap between Matlab and C are smaller than 8 times.

Fig. Compare different matrix type in Mathematica. Algorithms automatically picked by program.

Matlab (WITH the calculation of eigenvectors)

K = 10;

fileID = fopen('octave_out.txt','w');

for N = 100:100:1000
    AverageTime = 0.0;

    for k = 1:K
        A = randn(N, N);
        A = triu(A) + triu(A, 1)';
        tic;
        [V,D] = eig(A);
        AverageTime = AverageTime + toc/K;
    end

    disp([num2str(N), ' ', num2str(AverageTime), '\n']);
    fprintf(fileID, '%d %f\n', N, AverageTime);
end

fclose(fileID);

C++ (WITHOUT the calculation of eigenvectors)

#include <iostream>
#include <fstream>
#include <time.h>

#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_eigen.h>
#include <gsl/gsl_vector.h>
#include <gsl/gsl_matrix.h>

int main()
{
    const int K = 10;

    gsl_rng * RandomNumberGenerator = gsl_rng_alloc(gsl_rng_default);
    gsl_rng_set(RandomNumberGenerator, 0);

    std::ofstream OutputFile("atlas.txt", std::ios::trunc);

    for (int N = 100; N <= 1000; N += 100)
    {
        gsl_matrix* A = gsl_matrix_alloc(N, N);
        gsl_eigen_symm_workspace* EigendecompositionWorkspace = gsl_eigen_symm_alloc(N);
        gsl_vector* Eigenvalues = gsl_vector_alloc(N);

        double AverageTime = 0.0;
        for (int k = 0; k < K; k++)
        {   
            for (int i = 0; i < N; i++)
            {
                for (int j = i; j < N; j++)
                {
                    double rn = gsl_ran_gaussian(RandomNumberGenerator, 1.0);
                    gsl_matrix_set(A, i, j, rn);
                    gsl_matrix_set(A, j, i, rn);
                }
            }

            timespec start, end;
            clock_gettime(CLOCK_MONOTONIC_RAW, &start);

            gsl_eigen_symm(A, Eigenvalues, EigendecompositionWorkspace);

            clock_gettime(CLOCK_MONOTONIC_RAW, &end);
            double TimeElapsed = (double) ((1e9*end.tv_sec + end.tv_nsec) - (1e9*start.tv_sec + start.tv_nsec))/1.0e9;
            AverageTime += TimeElapsed/K;
            std::cout << "N = " << N << ", k = " << k << ", Time = " << TimeElapsed << std::endl;
        }
        OutputFile << N << " " << AverageTime << std::endl;

        gsl_matrix_free(A);
        gsl_eigen_symm_free(EigendecompositionWorkspace);
        gsl_vector_free(Eigenvalues);
    }

    return 0;
}

Mathematica

(* Symmetric real matrix + eigenvectors *)
Table[{NN, Mean[Table[(
     M = Table[Random[], {i, NN}, {j, NN}];
     M = M + Transpose[Conjugate[M]];
     AbsoluteTiming[Eigensystem[M]][[1]]
     ), {K, 10}]]
  }, {NN, Range[100, 1000, 100]}]

(* Symmetric real matrix *)
Table[{NN, Mean[Table[(
     M = Table[Random[], {i, NN}, {j, NN}];
     M = M + Transpose[Conjugate[M]];
     AbsoluteTiming[Eigenvalues[M]][[1]]
     ), {K, 10}]]
  }, {NN, Range[100, 1000, 100]}]

(* Asymmetric real matrix *)
Table[{NN, Mean[Table[(
     M = Table[Random[], {i, NN}, {j, NN}];
     AbsoluteTiming[Eigenvalues[M]][[1]]
     ), {K, 10}]]
  }, {NN, Range[100, 1000, 100]}]

(* Hermitian matrix *)
Table[{NN, Mean[Table[(
     M = Table[Random[] + I Random[], {i, NN}, {j, NN}];
     M = M + Transpose[Conjugate[M]];
     AbsoluteTiming[Eigenvalues[M]][[1]]
     ), {K, 10}]]
  }, {NN, Range[100, 1000, 100]}]

(* Random complex matrix *)
Table[{NN, Mean[Table[(
     M = Table[Random[] + I Random[], {i, NN}, {j, NN}];
     AbsoluteTiming[Eigenvalues[M]][[1]]
     ), {K, 10}]]
  }, {NN, Range[100, 1000, 100]}]

answered Oct 24 '22 13:10

unsym

Related questions
                            
                                Correctly using QSqlDatabase in multi-threaded programs
                            
                                How to gracefully shutdown a boost asio ssl client?
                            
                                C++ include with full path
                            
                                Does C# have an equivalent to decltype in C++11?
                            
                                What's the correct `enable_if` constraint on perfect forwarding setter?
                            
                                Instance of private nested class c++ [duplicate]
                            
                                Template argument type deduction by conversion operator
                            
                                Why set/map emplace_hint does not return a boolean
                            
                                Visual Studio 2015 C++ app requires api-ms-win-crt-runtime-l1-1-0.dll on client clients
                            
                                Efficient random number generation with C++11 <random>
                            
                                Is this "if e is a pack, then get a template name, otherwise get a variable name" valid or not?
                            
                                Implementing operator<=> for optional<T>
                            
                                Can a friend function in C++ have a default argument whose type has a private destructor?
                            
                                Give a name to a boost thread?
                            
                                C++11 vector push_back ambiguous
                            
                                What versions of gcc, llvm and clang are in XCode 4?
                            
                                How to obtain a pointer out of a C++ vtable?
                            
                                Boost thread Leakage C++
                            
                                Are system() calls evil?
                            
                                Why is it that wcout << ""; is OK but wcout << string(); is not?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why does MATLAB/Octave wipe the floor with C++ in Eigenvalue Problems?

Tags:

c++

performance

matlab

blas

gsl

MGA

People also ask

2 Answers

Micah

unsym

Recent Activity

Donate For Us