Sparse Matrix in Numba

I wish to speed up my machine learning algorithm (written in Python) using Numba (http://numba.pydata.org/). The algorithm takes a sparse matrix as its input data. In my pure Python implementation I used csr_matrix and related classes from SciPy, but apparently they are not compatible with Numba's JIT compiler.

I have also written my own custom class to implement the sparse matrix (basically a list of lists of (index, value) pairs), but it is also incompatible with Numba (i.e., I got a weird error message saying it doesn't recognize the extension type).

Is there an alternative, simple way to implement a sparse matrix using only NumPy (without resorting to SciPy) that is compatible with Numba? Any example code would be appreciated. Thanks!

702

asked Oct 17 '13 06:10

rjo2909

2 Answers

If all you have to do is iterate over the values of a CSR matrix, you can pass the attributes data, indptr, and indices to a function instead of the CSR matrix object.

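from scipy import sparse
from numba import njit

@njit
def print_csr(A, iA, jA):
    # A: nonzero values, iA: row pointers (indptr), jA: column indices
    for row in range(len(iA)-1):
        # entries of this row are stored at positions iA[row] .. iA[row+1]-1
        for i in range(iA[row], iA[row+1]):
            print(row, jA[i], A[i])

A = sparse.csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
# pass the three plain NumPy arrays, not the csr_matrix object itself
print_csr(A.data, A.indptr, A.indices)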
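The same idea extends from printing to actual computation. Below is a minimal sketch of a sparse matrix-vector product over the three CSR arrays, with the arrays written out by hand so SciPy is not needed at all. The function name csr_matvec and the njit fallback are my own illustration, not part of Numba or SciPy; the fallback only exists so the snippet still runs (uncompiled) if Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # assumption: fall back to plain Python if Numba is missing
    def njit(func):
        return func


@njit
def csr_matvec(data, indptr, indices, x):
    """Compute y = A @ x for a CSR matrix given as three flat arrays."""
    n_rows = len(indptr) - 1
    y = np.zeros(n_rows, dtype=np.float64)
    for row in range(n_rows):
        # Entries of this row live in data[indptr[row]:indptr[row + 1]].
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


# Same matrix as above, written out by hand as CSR arrays:
# [[1, 2, 0], [0, 0, 3], [4, 0, 5]]
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # nonzero values, row by row
indices = np.array([0, 1, 2, 0, 2])          # column index of each value
indptr = np.array([0, 2, 3, 5])              # where each row starts in data
x = np.array([1.0, 1.0, 1.0])

y = csr_matvec(data, indptr, indices, x)
print(y)
```

With a vector of ones, each entry of y is just the sum of the corresponding row, which makes the result easy to check by hand.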

197

answered Sep 18 '22 12:09

slek120

You can access the data of your sparse matrix as pure numpy or python. For example:

import numpy as np
from scipy import sparse

M = sparse.csr_matrix([[1,0,0],[1,0,1],[1,1,1]])
ML = M.tolil()

for d, r in zip(ML.data, ML.rows):
    # d, r are lists: the row's values and their column indices
    dr = np.array([d, r])
    print(dr)

produces:

[[1]
 [0]]
[[1 1]
 [0 2]]
[[1 1 1]
 [0 1 2]]

Surely numba can handle code that uses these arrays, provided, of course, that it does not expect each row to have the same size of array.

The lil format stores its values in 2 object dtype arrays (data and rows), with the values and column indices stored as lists, one list per row.
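As far as I know, Numba's nopython mode is happiest with flat, homogeneous arrays rather than object arrays of lists, so one option is to flatten the per-row lists first. Here is a rough sketch using only numpy; rows_to_csr is a hypothetical helper name of mine, and the float64/int64 dtypes are an assumption you may want to change.

```python
import numpy as np


def rows_to_csr(values_by_row, cols_by_row):
    """Flatten per-row lists into CSR-style data, indices and indptr arrays."""
    lengths = np.array([len(v) for v in values_by_row], dtype=np.int64)
    # indptr[i] is the offset in data/indices where row i starts.
    indptr = np.zeros(len(lengths) + 1, dtype=np.int64)
    indptr[1:] = np.cumsum(lengths)
    data = np.array([v for row in values_by_row for v in row], dtype=np.float64)
    indices = np.array([c for row in cols_by_row for c in row], dtype=np.int64)
    return data, indices, indptr


# Per-row lists matching the output printed above
values_by_row = [[1], [1, 1], [1, 1, 1]]
cols_by_row = [[0], [0, 2], [0, 1, 2]]

data, indices, indptr = rows_to_csr(values_by_row, cols_by_row)
print(data)
print(indices)
print(indptr)
```

Rows of different lengths are no longer a problem after this step, because row i is simply the slice indptr[i]:indptr[i+1] of the two flat arrays.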

answered Sep 19 '22 12:09

hpaulj

Tags:

python

numpy

anaconda

scipy

numba
