Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Constructing sparse matrix from list of lists of tuples

I have a Python list of row information for a sparse matrix. Each row is represented as a list of (column, value) tuples. Call it alist:

alist = [[(1,10), (3,-3)],
         [(2,12)]]

How can I efficiently construct a scipy sparse matrix from this list of lists, resulting in a matrix like this:

0  10   0  -3
0   0  12   0

The obvious approach is to make a scipy.sparse.lil_matrix, which internally has this "list of lists" structure. But from the scipy.sparse.lil_matrix — SciPy v0.19.0 Reference Guide I see just three ways of constructing them:

  • starting from a dense array
  • starting from another sparse array
  • just constructing an empty array

So the only way to get fresh data in is either to solve this problem with some other sparse matrix representation, or to start with a dense array, neither of which address the initial problem, and both of which seem likely to be less efficient representations than lil_matrix itself for this data.

I guess I can make an empty one, and use a loop to add values, but surely I'm missing something.

The scipy documentation is really frustrating when it comes to sparse matrices.

like image 366
nealmcb Avatar asked Apr 25 '17 19:04

nealmcb


People also ask

How do you construct a sparse matrix?

Description. S = sparse( A ) converts a full matrix into sparse form by squeezing out any zero elements. If a matrix contains many zeros, converting the matrix to sparse storage saves memory. S = sparse( m,n ) generates an m -by- n all zero sparse matrix.

What are tuples of sparse matrix?

Now to keep track of non-zero elements in a sparse matrix we have 3-tuple method using an array. Elements of the first row represent the number of rows, columns and non-zero values in the sparse matrix. Elements of the other rows give information about the location and value of non-zero elements.

Can sparse matrix be implemented using linked list?

Linked List representation of the sparse matrix. In a linked list representation, the linked list data structure is used to represent the sparse matrix. The advantage of using a linked list to represent the sparse matrix is that the complexity of inserting or deleting a node in a linked list is lesser than the array.

What is a sparse matrix explain sparse matrix with example?

Sparse matrix is a matrix which contains very few non-zero elements. When a sparse matrix is represented with a 2-dimensional array, we waste a lot of space to represent that matrix. For example, consider a matrix of size 100 X 100 containing only 10 non-zero elements.


1 Answers

If you have the whole alist before creating the sparse matrix, there is no need to use lil_matrix, since that is optimized for incrementally updating the sparse matrix.

If you want to do any sort of arithmetic with the matrix afterwords, csr_matrix is probably your best choice. You can construct the csr_matrix directly by using (data, (row, column)) format, like this:

In [40]: alist = [[(1,10), (3,-3)],
    ...:          [(2,12)]]

In [41]: i, j, data = zip(*((i, t[0], t[1]) for i, row in enumerate(alist) for t in row))

In [42]: (i, j, data)
Out[42]: ((0, 0, 1), (1, 3, 2), (10, -3, 12))

In [43]: csr_matrix((data, (i, j)), shape=(2, 4)).todense()
Out[43]: 
matrix([[ 0, 10,  0, -3],
        [ 0,  0, 12,  0]], dtype=int64)

If efficiency is a real concern, you can create the csr_matrix internal format (using indptr) directly:

In [57]: indptr = np.cumsum([0] + [len(row) for row in alist])

In [58]: j, data = zip(*(t for row in alist for t in row))

In [59]: csr_matrix((data, j, indptr), shape=(2, 4)).todense()
Out[59]: 
matrix([[ 0, 10,  0, -3],
        [ 0,  0, 12,  0]])

If you're converting to pandas afterwords, coo_matrix is the way to go, since pandas converts to coo_matrix anyway:

In [41]: i, j, data = zip(*((i, t[0], t[1]) for i, row in enumerate(alist) for t in row))

In [43]: coo_matrix((data, (i, j)), shape=(2, 4))
like image 80
kuppern87 Avatar answered Sep 18 '22 20:09

kuppern87