
Convert sparse matrix (csc_matrix) to pandas dataframe

I want to convert this matrix (a csc_matrix) into a pandas DataFrame.

In the printed matrix, the first number in each bracket should become the index, the second number the column, and the number at the end the data.

I want to do this for feature selection in text analysis: the first number represents the document, the second the word feature, and the last number the TF-IDF score.

Having a DataFrame lets me treat the text analysis problem as a data analysis problem.

Miya Wang asked Apr 13 '16 02:04


People also ask

How do you convert a sparse matrix to dense?

You can use either the todense() or the toarray() method to convert a CSR matrix to a dense matrix.
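As a rough illustration of the difference between the two methods (the small matrix here is made up for the example): toarray() returns a regular NumPy array, while todense() on a csr_matrix returns a numpy.matrix, so toarray() is usually the more convenient choice.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Small made-up matrix, purely for illustration.
sparse = csr_matrix(np.array([[0, 0, 4],
                              [1, 0, 0]]))

dense_array = sparse.toarray()    # plain numpy.ndarray
dense_matrix = sparse.todense()   # numpy.matrix (always 2-D)

print(type(dense_array).__name__)
print(type(dense_matrix).__name__)
print(dense_array)
```

Both calls materialise every zero, so they are only practical when the full matrix fits in memory.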

How do you create a sparse DataFrame in Python?

Use DataFrame.sparse.from_spmatrix() to create a DataFrame with sparse values from a sparse matrix.
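A minimal sketch of that call, assuming a reasonably recent pandas (the sparse accessor is not available in very old releases). The TF-IDF values and the feature names are invented for the example; in a real pipeline they would come from your vectorizer.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csc_matrix

# Made-up TF-IDF scores: 2 documents x 3 word features.
tfidf = csc_matrix(np.array([[0.0, 0.5, 0.0],
                             [0.3, 0.0, 0.0]]))

# Hypothetical feature names, not taken from the question.
feature_names = ["apple", "banana", "cherry"]

# Each column is stored as a pandas SparseArray, so zeros are not materialised.
sparse_df = pd.DataFrame.sparse.from_spmatrix(tfidf, columns=feature_names)

print(sparse_df.dtypes)
print(sparse_df.sparse.density)
```

This keeps the wide document-by-feature layout, unlike the long (index, col, data) layout the question asks for.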


1 Answer

import numpy as np
import pandas as pd
from scipy.sparse import csc_matrix

csc = csc_matrix(np.array(
    [[0, 0, 4, 0, 0, 0],
     [1, 0, 0, 0, 2, 0],
     [2, 0, 0, 1, 0, 0],
     [0, 0, 0, 0, 0, 1],
     [4, 0, 3, 2, 0, 0]]))

# Return a coordinate (COO) representation of the compressed sparse column (CSC) matrix.
coo = csc.tocoo(copy=False)

# Access `row`, `col` and `data` properties of coo matrix.
>>> pd.DataFrame({'index': coo.row, 'col': coo.col, 'data': coo.data}
                 )[['index', 'col', 'data']].sort_values(['index', 'col']
                 ).reset_index(drop=True)
   index  col  data
0      0    2     4
1      1    0     1
2      1    4     2
3      2    0     2
4      2    3     1
5      3    5     1
6      4    0     4
7      4    2     3
8      4    3     2
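If you need this more than once, the same steps can be wrapped in a small helper. This is only a sketch: the function name sparse_to_long_frame is made up here, and on Python 3.7+ the dict already preserves the column order, so the extra column selection is not needed.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csc_matrix


def sparse_to_long_frame(matrix):
    """Return one row per stored entry: (index, col, data). Hypothetical helper."""
    coo = matrix.tocoo()  # COO exposes parallel row / col / data arrays
    frame = pd.DataFrame({"index": coo.row, "col": coo.col, "data": coo.data})
    # Sort by row, then column, so entries appear in document order.
    return frame.sort_values(["index", "col"]).reset_index(drop=True)


csc = csc_matrix(np.array(
    [[0, 0, 4, 0, 0, 0],
     [1, 0, 0, 0, 2, 0],
     [2, 0, 0, 1, 0, 0],
     [0, 0, 0, 0, 0, 1],
     [4, 0, 3, 2, 0, 0]]))

long_df = sparse_to_long_frame(csc)

# Sanity check: rebuilding the matrix from the long frame gives the same values.
rebuilt = csc_matrix(
    (long_df["data"].to_numpy(),
     (long_df["index"].to_numpy(), long_df["col"].to_numpy())),
    shape=csc.shape)
print(np.array_equal(rebuilt.toarray(), csc.toarray()))
```

Passing shape explicitly in the round trip matters when trailing rows or columns are entirely zero, since the coordinates alone cannot recover them.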
Alexander answered Oct 17 '22 07:10