No problem:
>>> import numpy as np
>>> import scipy.sparse
>>> t = np.array([[1,1,1,1,1],[2,2,2,2,2],[3,3,3,3,3],[4,4,4,4,4],[5,5,5,5,5]])
>>> x = np.arange(5).reshape((-1,1)); y = np.arange(5)
>>> print (t[[x]],t[[y]])
Big problem:
>>> s = scipy.sparse.csr_matrix(t)
>>> print (s[[x]].toarray(),s[[y]].toarray())
Traceback (most recent call last):
File "<pyshell#22>", line 1, in <module>
: :
: :
ValueError: data, indices, and indptr should be rank 1
s.toarray()[[x]]
works great, but it defeats the whole purpose of using sparse matrices, because my arrays are too big to hold in dense form. I've checked the attributes and methods of several sparse matrix classes for anything referencing advanced indexing, but no dice. Any ideas?
Using sparse matrices to store data that contains a large number of zero-valued elements can both save a significant amount of memory and speed up the processing of that data. sparse is an attribute that you can assign to any two-dimensional MATLAB® matrix that is composed of double or logical elements.
The problem with representing these sparse matrices as dense matrices is that memory is required and must be allocated for each 32-bit or even 64-bit zero value in the matrix. This is clearly a waste of memory resources as those zero values do not contain any information.
Storage: When a matrix has many zero elements and only a few non-zero elements, a sparse array is preferable to a plain array because it needs less memory: only the non-zero elements are stored.
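As a rough illustration of the storage argument above, the sketch below compares the memory used by a dense NumPy array with the three buffers that back a SciPy CSR matrix. The matrix size and the placement of the non-zero entries are arbitrary choices made for this example, not something from the original question.

```python
import numpy as np
import scipy.sparse as sp

# Illustrative data (an assumption): a 1000 x 1000 float64 array
# with only 100 non-zero entries.
dense = np.zeros((1000, 1000), dtype=np.float64)
dense[::100, ::100] = 1.0

s = sp.csr_matrix(dense)

# The dense array pays for every element, zero or not.
dense_bytes = dense.nbytes

# A CSR matrix stores the non-zero values plus two index arrays.
sparse_bytes = s.data.nbytes + s.indices.nbytes + s.indptr.nbytes

print("non-zero entries:", s.nnz)
print("dense bytes:     ", dense_bytes)
print("CSR bytes:       ", sparse_bytes)
```

The exact CSR figure depends on the index dtype SciPy picks, but it should be a small fraction of the dense size whenever most entries are zero.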
Matrices that mostly contain zeroes are said to be sparse. Sparse matrices are commonly used in applied machine learning (for example, in data with encodings that map categories to counts) and even in whole subfields of machine learning such as natural language processing (NLP).
Sparse matrices have very limited indexing support, and what is available depends on the format of the matrix.
For example:
>>> a = scipy.sparse.rand(100,100,format='coo')
>>> a[2:5, 6:8]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'coo_matrix' object has no attribute '__getitem__'
but
>>> a = scipy.sparse.rand(100,100,format='csc')
>>> a[2:5, 6:8]
<3x2 sparse matrix of type '<type 'numpy.float64'>'
with 0 stored elements in Compressed Sparse Column format>
although
>>> a[2:5:2, 6:8:3]
Traceback (most recent call last):
...
ValueError: slicing with step != 1 not supported
There is also
>>> a = scipy.sparse.rand(100,100,format='dok')
>>> a[2:5:2, 6:8:3]
Traceback (most recent call last):
...
NotImplementedError: fancy indexing supported over one axis only
>>> a[2:5:2,1]
<3x1 sparse matrix of type '<type 'numpy.float64'>'
with 0 stored elements in Dictionary Of Keys format>
And even
>>> a = scipy.sparse.rand(100,100,format='lil')
>>> a[2:5:2,1]
<2x1 sparse matrix of type '<type 'numpy.int32'>'
with 0 stored elements in LInked List format>
C:\Python27\lib\site-packages\scipy\sparse\lil.py:230: SparseEfficiencyWarning: Indexing into a lil_matrix with multiple indices is slow. Pre-converting to CSC or CSR beforehand is more efficient.
SparseEfficiencyWarning)
>>> a[2:5:2, 6:8:3]
<2x1 sparse matrix of type '<type 'numpy.int32'>'
with 0 stored elements in LInked List format>
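One way to work around these limits is to convert to CSR (or CSC) before indexing and then index one axis at a time. The following is a minimal sketch, assuming a reasonably recent SciPy in which CSR matrices accept integer index arrays; the `rows` and `cols` arrays are made up for illustration, and `t` is the matrix from the question.

```python
import numpy as np
import scipy.sparse as sp

# Same matrix as in the question.
t = np.array([[1, 1, 1, 1, 1],
              [2, 2, 2, 2, 2],
              [3, 3, 3, 3, 3],
              [4, 4, 4, 4, 4],
              [5, 5, 5, 5, 5]])

# Start from COO (which may not support indexing) and convert to CSR.
s = sp.coo_matrix(t).tocsr()

# Hypothetical index arrays, chosen only for this example.
rows = np.array([0, 2, 4])
cols = np.array([1, 3])

row_sel = s[rows]           # selected rows, result stays sparse
col_sel = s[:, cols]        # selected columns, result stays sparse
block = s[rows][:, cols]    # rows first, then columns of that result

# Convert only the small result to dense for display.
print(block.toarray())
```

Indexing one axis at a time avoids the format-specific restrictions on combined indices shown above, and only the small result is ever converted to a dense array.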