I have 3 sparse matrices:
In [39]:
mat1
Out[39]:
(1, 878049)
<1x878049 sparse matrix of type '<type 'numpy.int64'>'
with 878048 stored elements in Compressed Sparse Row format>
In [37]:
mat2
Out[37]:
(1, 878049)
<1x878049 sparse matrix of type '<type 'numpy.int64'>'
with 744315 stored elements in Compressed Sparse Row format>
In [35]:
mat3
Out[35]:
(1, 878049)
<1x878049 sparse matrix of type '<type 'numpy.int64'>'
with 788618 stored elements in Compressed Sparse Row format>
From the documentation, I read that it is possible to hstack, vstack, and concatenate this type of matrix. So I tried to hstack them:
import numpy as np
matrix1 = np.hstack([[address_feature, dayweek_feature]]).T
matrix2 = np.vstack([[matrix1, pddis_feature]]).T
X = matrix2
However, the dimensions do not match:
In [41]:
X_combined_features.shape
Out[41]:
(2, 1)
Note that I am stacking these matrices because I would like to use them with a scikit-learn classification algorithm. How should I hstack a number of different sparse matrices?
Use the sparse versions of vstack and hstack. As a general rule, you need to use the sparse functions and methods, not the numpy ones with similar names. sparse matrices are not subclasses of the numpy ndarray.
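As a minimal sketch of that rule, the snippet below stacks three small row matrices with the scipy.sparse functions. The tiny 1x4 matrices are hypothetical stand-ins for the 1x878049 ones in the question, and the variable names side_by_side and stacked_rows are made up for illustration.

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-ins for the three 1 x N feature rows in the question.
mat1 = sparse.csr_matrix(np.array([[1, 0, 2, 0]]))
mat2 = sparse.csr_matrix(np.array([[0, 3, 0, 0]]))
mat3 = sparse.csr_matrix(np.array([[0, 0, 0, 4]]))

# Sparse matrices are not ndarrays, so the numpy stackers do not
# know how to join them.
is_ndarray = isinstance(mat1, np.ndarray)

# Pass one flat list of matrices, with no extra brackets around it.
side_by_side = sparse.hstack([mat1, mat2, mat3], format="csr")  # columns appended
stacked_rows = sparse.vstack([mat1, mat2, mat3], format="csr")  # rows appended

print(is_ndarray, side_by_side.shape, stacked_rows.shape)
```

Passing format="csr" is optional; without it, scipy picks a result format on its own.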
But your three matrices do not look sparse. They are 1x878049. One has 878048 nonzero elements, which means it has just one 0 element.
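So you could just as well turn them into dense arrays (with .toarray() or .A) and use np.hstack or np.vstack. The .A shorthand was deprecated and then dropped in recent SciPy releases, so .toarray() is the safer spelling there.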
np.hstack([address_feature.A, dayweek_feature.A])
And don't use the double brackets. All the concatenate functions take a simple list or tuple of arrays, and that list can hold more than two arrays:
In [296]: A=sparse.csr_matrix([0,1,2,0,0,1])
In [297]: B=sparse.csr_matrix([0,0,0,1,0,1])
In [298]: C=sparse.csr_matrix([1,0,0,0,1,0])
In [299]: sparse.vstack([A,B,C])
Out[299]:
<3x6 sparse matrix of type '<class 'numpy.int32'>'
with 7 stored elements in Compressed Sparse Row format>
In [300]: sparse.vstack([A,B,C]).A
Out[300]:
array([[0, 1, 2, 0, 0, 1],
       [0, 0, 0, 1, 0, 1],
       [1, 0, 0, 0, 1, 0]], dtype=int32)
In [301]: sparse.hstack([A,B,C]).A
Out[301]: array([[0, 1, 2, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0]], dtype=int32)
In [302]: np.vstack([A.A,B.A,C.A])
Out[302]:
array([[0, 1, 2, 0, 0, 1],
       [0, 0, 0, 1, 0, 1],
       [1, 0, 0, 0, 1, 0]], dtype=int32)
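For the scikit-learn use case, here is a rough sketch under one assumption: each 1xN matrix holds a single feature for N samples, so the classifier needs an N x 3 matrix (samples as rows). The feature names are borrowed from the question, but the small values are invented for illustration.

```python
import numpy as np
from scipy import sparse

# Hypothetical 1 x 5 feature rows standing in for the 1 x 878049 matrices.
address_feature = sparse.csr_matrix(np.array([[1, 0, 2, 0, 3]]))
dayweek_feature = sparse.csr_matrix(np.array([[0, 1, 0, 0, 6]]))
pddis_feature = sparse.csr_matrix(np.array([[4, 0, 0, 5, 0]]))

# Stack the features as rows (3 x 5), then transpose so that every
# sample becomes a row and every feature a column (5 x 3).
X = sparse.vstack([address_feature, dayweek_feature, pddis_feature]).T.tocsr()

# Equivalent route: turn each row into a column first, then hstack.
X_alt = sparse.hstack(
    [address_feature.T, dayweek_feature.T, pddis_feature.T], format="csr"
)

print(X.shape)      # (n_samples, n_features)
print(X.toarray())
```

Many scikit-learn estimators accept a CSR matrix like X directly. If the data is as dense as in the question, X.toarray() gives an ordinary ndarray instead.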