I want to create a large (say 10^5 x 10^5) sparse circulant matrix in Python. It has 4 elements per row at positions [i,i+1], [i,i+2], [i,i+N-2], [i,i+N-1]
, where I have assumed periodic boundary conditions for the indices (i.e. [10^5,10^5]=[0,0], [10^5+1,10^5+1]=[1,1]
and so on). I looked at the scipy sparse matrices documentation but I am quite confused (I am new to Python).
I can create the matrix with numpy
import numpy as np
def Bc(i, boundary):
"""(int, int) -> int
Checks boundary conditions on index
"""
if i > boundary - 1:
return i - boundary
elif i < 0:
return boundary + i
else:
return i
N = 100
diffMat = np.zeros([N, N])
for i in np.arange(0, N, 1):
diffMat[i, [Bc(i+1, N), Bc(i+2, N), Bc(i+2+(N-5)+1, N), Bc(i+2+(N-5)+2, N)]] = [2.0/3, -1.0/12, 1.0/12, -2.0/3]
However, this is quite slow and for large N
uses a lot of memory, so I want to avoid the creation with numpy and the converting to a sparse matrix and go directly to the latter.
I know how to do it in Mathematica, where one can use SparseArray and index patterns - is there something similar here?
1 Answer. You can use either todense() or toarray() function to convert a CSR matrix to a dense matrix.
Description. S = sparse( A ) converts a full matrix into sparse form by squeezing out any zero elements. If a matrix contains many zeros, converting the matrix to sparse storage saves memory. S = sparse( m,n ) generates an m -by- n all zero sparse matrix.
Matrices that mostly contain zeroes are said to be sparse. Sparse matrices are commonly used in applied machine learning (such as in data containing data-encodings that map categories to count) and even in whole subfields of machine learning such as natural language processing (NLP).
Diagonals containing only zero are omitted. For sparse matrices with very few non-zero diagonals, such as diagonal or tridiagonal matrices, the DIA format allows for very quick arithmetic operations. Its main limitation is that looking up each matrix element requires performing a blind search through the offsets array.
To create a dense circulant matrix, you can use scipy.linalg.circulant
. For example,
In [210]: from scipy.linalg import circulant
In [211]: N = 7
In [212]: vals = np.array([2.0/3, -1.0/12, 1.0/12, -2.0/3])
In [213]: offsets = np.array([1, 2, N-2, N-1])
In [214]: col0 = np.zeros(N)
In [215]: col0[offsets] = -vals
In [216]: c = circulant(col0)
In [217]: c
Out[217]:
array([[ 0. , 0.6667, -0.0833, 0. , 0. , 0.0833, -0.6667],
[-0.6667, 0. , 0.6667, -0.0833, 0. , 0. , 0.0833],
[ 0.0833, -0.6667, 0. , 0.6667, -0.0833, 0. , 0. ],
[ 0. , 0.0833, -0.6667, 0. , 0.6667, -0.0833, 0. ],
[ 0. , 0. , 0.0833, -0.6667, 0. , 0.6667, -0.0833],
[-0.0833, 0. , 0. , 0.0833, -0.6667, 0. , 0.6667],
[ 0.6667, -0.0833, 0. , 0. , 0.0833, -0.6667, 0. ]])
As you point out, for large N
, that requires a lot of memory and most of the values are zero. To create a scipy sparse matrix, you can use scipy.sparse.diags
. We have to create offsets (and corresponding values) for the diagonals above and below the main diagonal:
In [218]: from scipy import sparse
In [219]: N = 7
In [220]: vals = np.array([2.0/3, -1.0/12, 1.0/12, -2.0/3])
In [221]: offsets = np.array([1, 2, N-2, N-1])
In [222]: dupvals = np.concatenate((vals, vals[::-1]))
In [223]: dupoffsets = np.concatenate((offsets, -offsets))
In [224]: a = sparse.diags(dupvals, dupoffsets, shape=(N, N))
In [225]: a.toarray()
Out[225]:
array([[ 0. , 0.6667, -0.0833, 0. , 0. , 0.0833, -0.6667],
[-0.6667, 0. , 0.6667, -0.0833, 0. , 0. , 0.0833],
[ 0.0833, -0.6667, 0. , 0.6667, -0.0833, 0. , 0. ],
[ 0. , 0.0833, -0.6667, 0. , 0.6667, -0.0833, 0. ],
[ 0. , 0. , 0.0833, -0.6667, 0. , 0.6667, -0.0833],
[-0.0833, 0. , 0. , 0.0833, -0.6667, 0. , 0.6667],
[ 0.6667, -0.0833, 0. , 0. , 0.0833, -0.6667, 0. ]])
The matrix is stored in the "diagonal" format:
In [226]: a
Out[226]:
<7x7 sparse matrix of type '<class 'numpy.float64'>'
with 28 stored elements (8 diagonals) in DIAgonal format>
You can use the conversion methods of the sparse matrix to convert it to a different sparse format. For example, the following results in a matrix in CSR format:
In [227]: a.tocsr()
Out[227]:
<7x7 sparse matrix of type '<class 'numpy.float64'>'
with 28 stored elements in Compressed Sparse Row format>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With