I have in my code the following expression:
a = (b / x[:, np.newaxis]).sum(axis=1)
where b
is an ndarray of shape (M, N)
, and x
is an ndarray of shape (M,)
. Now, b
is actually sparse, so for memory efficiency I would like to substitute in a scipy.sparse.csc_matrix
or csr_matrix
. However, broadcasting in this way is not implemented (even though division or multiplication is guaranteed to maintain sparsity) (the entries of x
are non-zero), and raises a NotImplementedError
. Is there a sparse
function I'm not aware of that would do what I want? (dot()
would sum along the wrong axis.)
Returns a copy of column i of the matrix, as a (m x 1) CSR matrix (column vector). Format of a matrix representation as a string. Maximum number of elements to display when printed. Number of stored values, including explicit zeros.
CSC is more efficient at accessing column-vectors or column operations, generally, as it is stored as arrays of columns and their value at each row. CSR matrices are the opposite; stored as arrays of rows and their values at each column, and are more efficient at accessing row-vectors or row operations.
We use the multiply() method provided in both csc_matrix and csr_matrix classes to multiply two sparse matrices. We can multiply two matrices of same format( both matrices are csc or csr format) and also of different formats ( one matrix is csc and other is csr format).
csc_matrix. This is the traditional format for specifying a sparse matrix in MATLAB (via the sparse function).
If b
is in CSC format, then b.data
has the non-zero entries of b
, and b.indices
has the row index of each of the non-zero entries, so you can do your division as:
b.data /= np.take(x, b.indices)
It's hackier than Warren's elegant solution, but it will probably also be faster in most settings:
b = sps.rand(1000, 1000, density=0.01, format='csc')
x = np.random.rand(1000)
def row_divide_col_reduce(b, x):
data = b.data.copy() / np.take(x, b.indices)
ret = sps.csc_matrix((data, b.indices.copy(), b.indptr.copy()),
shape=b.shape)
return ret.sum(axis=1)
def row_divide_col_reduce_bis(b, x):
d = sps.spdiags(1.0/x, 0, len(x), len(x))
return (d * b).sum(axis=1)
In [2]: %timeit row_divide_col_reduce(b, x)
1000 loops, best of 3: 210 us per loop
In [3]: %timeit row_divide_col_reduce_bis(b, x)
1000 loops, best of 3: 697 us per loop
In [4]: np.allclose(row_divide_col_reduce(b, x),
...: row_divide_col_reduce_bis(b, x))
Out[4]: True
You can cut the time almost in half in the above example if you do the division in-place, i.e.:
def row_divide_col_reduce(b, x):
b.data /= np.take(x, b.indices)
return b.sum(axis=1)
In [2]: %timeit row_divide_col_reduce(b, x)
10000 loops, best of 3: 131 us per loop
To implement a = (b / x[:, np.newaxis]).sum(axis=1)
, you can use a = b.sum(axis=1).A1 / x
. The A1
attribute returns the 1D ndarray, so the result is a 1D ndarray, not a matrix
. This concise expression works because you are both scaling by x
and summing along axis 1. For example:
In [190]: b
Out[190]:
<3x3 sparse matrix of type '<type 'numpy.float64'>'
with 5 stored elements in Compressed Sparse Row format>
In [191]: b.A
Out[191]:
array([[ 1., 0., 2.],
[ 0., 3., 0.],
[ 4., 0., 5.]])
In [192]: x
Out[192]: array([ 2., 3., 4.])
In [193]: b.sum(axis=1).A1 / x
Out[193]: array([ 1.5 , 1. , 2.25])
More generally, if you want to scale the rows of a sparse matrix with a vector x
, you could multiply b
on the left with a sparse matrix containing 1.0/x
on the diagonal. The function scipy.sparse.spdiags
can be used to create such a matrix. For example:
In [71]: from scipy.sparse import csc_matrix, spdiags
In [72]: b = csc_matrix([[1,0,2],[0,3,0],[4,0,5]], dtype=np.float64)
In [73]: b.A
Out[73]:
array([[ 1., 0., 2.],
[ 0., 3., 0.],
[ 4., 0., 5.]])
In [74]: x = array([2., 3., 4.])
In [75]: d = spdiags(1.0/x, 0, len(x), len(x))
In [76]: d.A
Out[76]:
array([[ 0.5 , 0. , 0. ],
[ 0. , 0.33333333, 0. ],
[ 0. , 0. , 0.25 ]])
In [77]: p = d * b
In [78]: p.A
Out[78]:
array([[ 0.5 , 0. , 1. ],
[ 0. , 1. , 0. ],
[ 1. , 0. , 1.25]])
In [79]: a = p.sum(axis=1)
In [80]: a
Out[80]:
matrix([[ 1.5 ],
[ 1. ],
[ 2.25]])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With