Improving Performance of Multiplication of Scipy Sparse Matrices

Given a Scipy CSC sparse matrix "sm" with dimensions (170k x 170k) and 440 million non-null points, and a sparse CSC vector "v" (170k x 1) with only a few non-null points, is there anything that can be done to improve the performance of the operation:

result = sm.dot(v)

?

Currently it takes roughly 1 second. Initializing the matrices as CSR increased the time to about 3 seconds, so CSC performed better.
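
A rough way to reproduce the CSC vs. CSR comparison is a timing sketch like the one below; the random stand-ins are scaled down in density (the real matrix is far denser), so only the relative timings are meaningful:

import scipy.sparse as sps
from timeit import timeit

# Scaled-down random stand-ins for sm and v, purely for illustration.
n = 170000
sm_csc = sps.rand(n, n, density=1e-4, format='csc')
v = sps.rand(n, 1, density=1e-4, format='csc')
sm_csr = sm_csc.tocsr()

print('CSC:', timeit(lambda: sm_csc.dot(v), number=10))
print('CSR:', timeit(lambda: sm_csr.dot(v), number=10))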

sm is a matrix of similarities between products, and v is a vector representing which products the user bought or clicked on, so sm is the same for every user.

I'm using Ubuntu 13.04 on an Intel i3 @ 3.4 GHz with 4 cores.

Researching on SO, I read about the ATLAS package. I typed into the terminal:

~$ ldd /usr/lib/python2.7/dist-packages/numpy/core/_dotblas.so

Which resulted in:

    linux-vdso.so.1 =>  (0x00007fff56a88000)
    libblas.so.3 => /usr/lib/libblas.so.3 (0x00007f888137f000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8880fb7000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f8880cb1000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f888183c000)

From what I understand, this means that I'm already using a high-performance BLAS package. I'm still not sure whether this package implements parallel computing, but it doesn't look like it does.

Could multi-core processing help boost performance? If so, is there any library that could be helpful in Python?

I was also considering implementing this in Cython, but I don't know whether that would lead to good results.

Thanks in advance.

asked Sep 03 '13 by Willian Fuks


2 Answers

Recently I had the same issue. I solved it like this:

def sparse_col_vec_dot(csc_mat, csc_vec):
    # Convert the matrix to CSR so a row-major matrix multiplies
    # the column-major vector.
    curr_mat = csc_mat.tocsr()
    return curr_mat * csc_vec

The trick here is that we have to keep one operand in row representation (CSR) and the other in column representation (CSC).
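
A quick way to sanity-check the helper, with made-up random data purely for illustration:

import numpy as np
import scipy.sparse as sps

sm = sps.rand(1000, 1000, density=0.01, format='csc')
v = sps.rand(1000, 1, density=0.005, format='csc')

# The CSR-based product should match scipy's default product exactly.
assert np.allclose(sm.dot(v).toarray(),
                   sparse_col_vec_dot(sm, v).toarray())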

answered Nov 07 '22 by Vaali

The sparse matrix multiplication routines are coded directly in C++ and, as far as a quick look at the source reveals, don't hook into any optimized library. Furthermore, they don't take advantage of the fact that the second matrix is a vector to minimize calculations. So you can probably speed things up quite a bit by accessing the guts of the sparse matrix and customizing the multiplication algorithm. The following code does so in pure Python/NumPy, and if the vector really has "a few non-null points" it matches the speed of SciPy's C++ code; if you implemented it in Cython, the speed increase should be noticeable:

import numpy as np
import scipy.sparse as sps

def sparse_col_vec_dot(csc_mat, csc_vec):
    # row numbers of vector non-zero entries
    v_rows = csc_vec.indices
    v_data = csc_vec.data
    # matrix description arrays
    m_dat = csc_mat.data
    m_ind = csc_mat.indices
    m_ptr = csc_mat.indptr
    # output arrays: one slot per non-zero of every matrix column
    # selected by a non-zero vector entry
    sizes = m_ptr.take(v_rows+1) - m_ptr.take(v_rows)
    sizes = np.concatenate(([0], np.cumsum(sizes)))
    data = np.empty((sizes[-1],), dtype=csc_mat.dtype)
    indices = np.empty((sizes[-1],), dtype=np.intp)
    indptr = np.zeros((2,), dtype=np.intp)

    for j in range(len(sizes)-1):
        # scale column v_rows[j] of the matrix by the j-th vector entry
        slice_ = slice(*m_ptr[[v_rows[j], v_rows[j]+1]])
        np.multiply(m_dat[slice_], v_data[j], out=data[sizes[j]:sizes[j+1]])
        indices[sizes[j]:sizes[j+1]] = m_ind[slice_]
    indptr[-1] = len(data)
    ret = sps.csc_matrix((data, indices, indptr),
                         shape=csc_vec.shape)
    # collapse repeated row indices by summing their values
    ret.sum_duplicates()

    return ret
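
A quick check against scipy's built-in product (test data invented for illustration) might look like:

sm = sps.rand(2000, 2000, density=0.01, format='csc')
v = sps.rand(2000, 1, density=0.002, format='csc')

ref = sm.dot(v)
out = sparse_col_vec_dot(sm, v)
print(np.abs((ref - out).toarray()).max())  # expect 0.0 up to rounding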

A quick explanation of what is going on: a CSC matrix is defined by three linear arrays (a small worked example follows the list):

  • data contains the non-zero entries, stored in column major order.
  • indices contains the rows of the non-zero entries.
  • indptr has one entry more than the number of columns of the matrix; the items of column j are found in data[indptr[j]:indptr[j+1]], and their rows are given by indices[indptr[j]:indptr[j+1]].
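
For instance, a tiny matrix (values chosen arbitrarily) and its three arrays:

import numpy as np
import scipy.sparse as sps

m = sps.csc_matrix(np.array([[1, 0, 4],
                             [0, 2, 0],
                             [3, 0, 5]]))
print(m.data)     # [1 3 2 4 5]  non-zeros, column by column
print(m.indices)  # [0 2 1 0 2]  row of each non-zero
print(m.indptr)   # [0 2 3 5]    column j spans data[indptr[j]:indptr[j+1]]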

So to multiply this matrix by a sparse column vector, you can iterate over the data and indices of the column vector, and for each (d, r) pair extract the corresponding column of the matrix and multiply it by d, i.e. data[indptr[r]:indptr[r+1]] * d, with the rows given by indices[indptr[r]:indptr[r+1]].
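
Spelled out as a naive dense-output loop (a didactic restatement of the same idea, not an optimized implementation; the helper name is made up here):

import numpy as np

def naive_col_vec_dot(csc_mat, csc_vec):
    # Accumulate d * (column r of the matrix) for each
    # non-zero (d, r) pair of the vector.
    result = np.zeros(csc_mat.shape[0])
    for d, r in zip(csc_vec.data, csc_vec.indices):
        start, stop = csc_mat.indptr[r], csc_mat.indptr[r + 1]
        result[csc_mat.indices[start:stop]] += d * csc_mat.data[start:stop]
    return result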

answered Nov 07 '22 by Jaime