Error in scipy sparse diags matrix construction

Tags:

When using scipy.sparse.spdiags or scipy.sparse.diags I have noticed want I consider to be a bug in the routines eg

scipy.sparse.spdiags([1.1,1.2,1.3],1,4,4).toarray()

returns

array([[ 0. ,  1.2,  0. ,  0. ],
       [ 0. ,  0. ,  1.3,  0. ],
       [ 0. ,  0. ,  0. ,  0. ],
       [ 0. ,  0. ,  0. ,  0. ]])

That is for positive diagonals it drops the first k data. One might argue that there is some grand programming reason for this and that I just need to pad with zeros. OK annoying as that may be, one can use scipy.sparse.diags which gives the correct result. However this routine has a bug that can't be worked around

scipy.sparse.diags([1.1,1.2],0,(4,2)).toarray()

gives

array([[ 1.1,  0. ],
       [ 0. ,  1.2],
       [ 0. ,  0. ],
       [ 0. ,  0. ]])

nice, and

scipy.sparse.diags([1.1,1.2],-2,(4,2)).toarray()

gives

array([[ 0. ,  0. ],
       [ 0. ,  0. ],
       [ 1.1,  0. ],
       [ 0. ,  1.2]])

but

scipy.sparse.diags([1.1,1.2],-1,(4,2)).toarray()

gives an error saying ValueError: Diagonal length (index 0: 2 at offset -1) does not agree with matrix size (4, 2). Obviously the answer is

array([[ 0. ,  0. ],
       [ 1.1,  0. ],
       [ 0. ,  1.2],
       [ 0. ,  0. ]])

and for extra random behaviour we have

scipy.sparse.diags([1.1],-1,(4,2)).toarray()

giving

array([[ 0. ,  0. ],
       [ 1.1,  0. ],
       [ 0. ,  1.1],
       [ 0. ,  0. ]])

Anyone know if there is a function for constructing diagonal sparse matrices that actually works?

845

asked Nov 10 '15 04:11

user3799584

1 Answers

Executive summary: spdiags works correctly, even if the matrix input isn't the most intuitive. diags has a bug that affects some offsets in rectangular matrices. There is a bug fix on scipy github.

The example for spdiags is:

>>> data = array([[1,2,3,4],[1,2,3,4],[1,2,3,4]])
>>> diags = array([0,-1,2])
>>> spdiags(data, diags, 4, 4).todense()
matrix([[1, 0, 3, 0],
        [1, 2, 0, 4],
        [0, 2, 3, 0],
        [0, 0, 3, 4]])

Note that the 3rd column of data always appears in the 3rd column of the sparse. The other columns also line up. But they are omitted where they 'fall off the edge'.

The input to this function is a matrix, while the input to diags is a ragged list. The diagonals of the sparse matrix all have different numbers of values. So the specification has to accomodate this in one or other. spdiags does this by ignoring some values, diags by taking a list input.

The sparse.diags([1.1,1.2],-1,(4,2)) error is puzzling.

the spdiags equivalent does work:

In [421]: sparse.spdiags([[1.1,1.2]],-1,4,2).A
Out[421]: 
array([[ 0. ,  0. ],
       [ 1.1,  0. ],
       [ 0. ,  1.2],
       [ 0. ,  0. ]])

The error is raised in this block of code:

for j, diagonal in enumerate(diagonals):
    offset = offsets[j]
    k = max(0, offset)
    length = min(m + offset, n - offset)
    if length <= 0:
        raise ValueError("Offset %d (index %d) out of bounds" % (offset, j))
    try:
        data_arr[j, k:k+length] = diagonal
    except ValueError:
        if len(diagonal) != length and len(diagonal) != 1:
            raise ValueError(
                "Diagonal length (index %d: %d at offset %d) does not "
                "agree with matrix size (%d, %d)." % (
                j, len(diagonal), offset, m, n))
        raise

The actual matrix constructor in the diags is:

dia_matrix((data_arr, offsets), shape=(m, n))

This is the same constructor that spdiags uses, but without any manipulation.

In [434]: sparse.dia_matrix(([[1.1,1.2]],-1),shape=(4,2)).A
Out[434]: 
array([[ 0. ,  0. ],
       [ 1.1,  0. ],
       [ 0. ,  1.2],
       [ 0. ,  0. ]])

In dia format, the inputs are stored exactly as given by spdiags (complete with that matrix with extra values):

In [436]: M.data
Out[436]: array([[ 1.1,  1.2]])
In [437]: M.offsets
Out[437]: array([-1], dtype=int32)

As @user2357112 points out, length = min(m + offset, n - offset is wrong, producing 3 in the test case. Changing it to length = min(m + k, n - k) makes all cases for this (4,2) matrix work. But it fails with the transpose: diags([1.1,1.2], 1, (2, 4))

The correction, as of Oct 5, for this issue is:

https://github.com/pv/scipy-work/commit/529cbde47121c8ed87f74fa6445c05d71353eb6c

length = min(m + offset, n - offset, min(m,n))

With this fix, diags([1.1,1.2], 1, (2, 4)) works.

165

answered Sep 30 '22 15:09

hpaulj

Related questions
                            
                                Using numba.jit with scipy.integrate.ode
                            
                                Is it possible to output to and monitor streams other than stdin, stdout & stderr? (python)
                            
                                sklearn agglomerative clustering input data
                            
                                Segfault on 2nd connection with pyodbc to mirrored MS SQL Server
                            
                                slice pandas df based on n consecutive instances of element
                            
                                Error opening sqlite table using pandas
                            
                                keep a continuous mongo connection active using pymongo
                            
                                Change python interpretor mid-script
                            
                                Python generators; two apparently identical programs work differently
                            
                                Getting current frame with OpenCV VideoCapture in Python
                            
                                Getting broken pipe when passing mysql connection to a python thread
                            
                                Interprogram communication in python on Linux
                            
                                Pandas efficient check if column contains string in other column
                            
                                Python how to scan string and and use upper() between two characters?
                            
                                Scrape just the text, within an html element that has a class, using beautiful soup
                            
                                tornado + momoko doesn't handle connection
                            
                                How to check subsequence exists in a list? [duplicate]
                            
                                jquery month picker: setting an initial min/max range conflicts with "from" <"to" function
                            
                                Is there any kind of subspace clustering package available in scikit-learn
                            
                                what PID does os.killpg look for

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Error in scipy sparse diags matrix construction

Tags:

python

scipy

sparse-matrix

user3799584

People also ask

1 Answers

hpaulj

Recent Activity

Donate For Us