Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Filling sparse matrix in Eigen is very slow

Tags:

c++

matrix

eigen

I have this code which calls a c++ method in java in a for loop:

JNIEXPORT void JNICALL Java_com_jp_algi_CoreC_MMload(JNIEnv *env3, jobject clazz3, jdoubleArray inputv, jintArray inputi, jint poc, jint pozic)
{
    jdouble* fltv2 ;
    jint* fltind2;
    jsize sizedat = env3->GetArrayLength(inputi);
    fltv2  = new jdouble[sizedat];
    fltind2  = new jint[sizedat];
    jint i;
    jint jm;
    env3->GetIntArrayRegion(inputi,0,sizedat,fltind2);
    env3->GetDoubleArrayRegion(inputv,0,sizedat,fltv2);

    // default is column major
    matA.reserve(VectorXi::Constant(1,sizedat));

    for ( jm = 0; jm < sizedat; jm++) {
        //matA.insert(fltind2[jm],pozic) = fltv2[jm]; // alternative: mat.coeffRef(i,j) += v_ij;
        matA.insert(fltind2[jm],pozic)= fltv2[jm];
        //matA.insertBack(fltind2[jm],pozic)= fltv2[jm];
        //matA.ins
        //matA.insertBackUncompressed();
        //matA.coeffRef(fltind2[jm],pozic) += fltv2[jm];
        // optional
    }

    matA.makeCompressed();

    //k++; //blbe zayklenji!!!
    env3->SetIntArrayRegion(inputi,0,sizedat,fltind2);
    env3->SetDoubleArrayRegion(inputv,0,sizedat,fltv2);
    delete[] fltv2;
    delete[] fltind2;
}

Where inputv are values of columns of matA.; and inputi are indexes of these values.

I read in the docs for eigen that the insert function is the fastest, and it's ok when the number of non-zero coefficients are about 5000. But when I have 25000 it takes 5 sec per column!

I tried insrtback, but the values are the same? What exactly does this command do? Is there some way to improve this code?

Once advantage (maybe): the values and indexes in every column are sorted by the values from highest to lowest...

like image 982
Daniela Agelova Avatar asked Jul 26 '13 09:07

Daniela Agelova


People also ask

What is the problem with sparse matrix?

The problem with representing these sparse matrices as dense matrices is that memory is required and must be allocated for each 32-bit or even 64-bit zero value in the matrix. This is clearly a waste of memory resources as those zero values do not contain any information.

How do you fill a sparse matrix?

To fill a SparseMatrix matrix quickly, the elements should be addressed in a sequence that corresponds to the storage order of the SparseMatrix. By default, the storage order in Eigen's SparseMatrix is column major, but it is easy to change this.

Is sparse matrix more efficient?

Using sparse matrices to store data that contains a large number of zero-valued elements can both save a significant amount of memory and speed up the processing of that data.

How sparse matrix are stored efficiently in memory?

A sparse matrix can be stored in full-matrix storage mode or a packed storage mode. When a sparse matrix is stored in full-matrix storage mode, all its elements, including its zero elements, are stored in an array.


2 Answers

If your sparse matrix is large, you have to allocate sufficient space for matA before filling it. Otherwise, it will take long long time to allocate space and copy data over and over again.

The first thing you need to do is to know the sparsity pattern of your matrix. What I mean by sparsity pattern is the number of non-zeros elements of each column (assume your sparse matrix is in column major). If we store those values in the variable V of type VectorXi, calling matA.reserve(V) is to allocate sufficient memory space.

Following the above procedure, I am able to fill a 47236x677399 sparse matrix (#non-zeros:49556258) within 30 seconds using an ordinary laptop. If I didn't do so, it takes forever...

like image 88
eaudex Avatar answered Sep 25 '22 13:09

eaudex


Important points from Eigen Sparse Matrix Tutorial:

1: SparseMatrix<double> mat(rows,cols);         // default is column major
2: mat.reserve(VectorXi::Constant(cols,x));
3: for each i,j such that v_ij != 0
4:   mat.insert(i,j) = v_ij;                    // alternative:  mat.coeffRef(i,j) += v_ij;
5: mat.makeCompressed();                        // optional

The key ingredient here is line 2 where we reserve room for x non-zeros per column. In many cases, the number of non-zeros per column or row can easily be known in advance. If it varies significantly for each inner vector, then it is possible to specify a reserve size for each inner vector by providing a vector object with an operator[](int j) returning the reserve size of the j-th inner vector (e.g., via a VectorXi or std::vector). If only a rough estimate of the number of nonzeros per inner-vector can be obtained, it is highly recommended to overestimate it rather than the opposite. If this line is omitted, then the first insertion of a new element will reserve room for 2 elements per inner vector.

Line 4 performs a sorted insertion for Column major case. For speed, when filling jth column, existing non-zeros should have row-indices smaller than i. Then, this operation boils down to trivial O(1) operation.

Line 5 suppresses remaining empty space and transforms the matrix into a compressed column storage.

like image 30
Hari Avatar answered Sep 24 '22 13:09

Hari