
Convert Pandas dataframe to Sparse Numpy Matrix directly

I am creating a matrix from a Pandas dataframe as follows:

dense_matrix = np.array(df.to_numpy(), dtype=bool).astype(int)

And then into a sparse matrix with:

sparse_matrix = scipy.sparse.csr_matrix(dense_matrix)

Is there any way to go from a DataFrame straight to a sparse matrix?

Thanks in advance.

user7289 asked Dec 08 '13 21:12



1 Answer

df.values is already a NumPy array, so wrapping it in np.array is unnecessary. np.array makes an extra copy by default, so passing df.values directly is generally faster.

scipy.sparse.csr_matrix(df.values)

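You only need the transpose (df.values.T) if you want the DataFrame's columns to become the rows of the sparse matrix. In a DataFrame, the rows (index) are axis 0 and the columns are axis 1, and df.values keeps that layout.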
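
As a minimal sketch of the direct route, assuming a small made-up DataFrame (the name `df` and its values are purely illustrative) and a pandas version that provides `to_numpy()`, the non-zero-to-1 conversion from the question can be applied to the array and passed straight to `csr_matrix`:

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Hypothetical example data: 4 rows, 3 columns, mostly zeros.
df = pd.DataFrame({
    "a": [0, 2, 0, 0],
    "b": [0, 0, 5, 0],
    "c": [1, 0, 0, 0],
})

# Binarize (non-zero -> 1, zero -> 0) and build the CSR matrix in one
# expression, without naming an intermediate dense matrix.
sparse_matrix = sparse.csr_matrix((df.to_numpy() != 0).astype(np.int64))

print(sparse_matrix.shape)  # (4, 3): one row per DataFrame row
print(sparse_matrix.nnz)    # 3 stored non-zero entries
```

Note that a dense array is still created temporarily inside the expression; only the named intermediate variable goes away.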

Dan Allan answered Sep 18 '22 12:09