
Compressed Sparse Column (CSC) or Compressed Sparse Row (CSR) sparse matrix?

I have a design matrix that I'm converting to a sparse matrix using the scipy module.

It has many rows and only a few columns.

With this shape, is it better to use the CSC or the CSR format? Or are they strictly equivalent in terms of execution speed?

Basically, it looks like this example (but there are many more rows in the real one):

[image of the example matrix]

Thanks!

Covich asked Sep 09 '15 14:09



1 Answer

You can readily convert one format to the other (.tocsc(), .tocsr()). In fact, M.T for a csr matrix just creates a csc matrix with the same data.
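As a rough illustration of the conversions mentioned above, here is a minimal sketch. The matrix shape, the random data, and the variable names are made up for the example; only .tocsc(), .tocsr() and .T come from the answer.

```python
import numpy as np
from scipy import sparse

# Hypothetical tall, narrow design matrix: many rows, few columns.
rng = np.random.default_rng(0)
dense = rng.random((1000, 5))
dense[dense < 0.7] = 0.0          # zero out most entries so it is sparse

M_csr = sparse.csr_matrix(dense)  # build in CSR format
M_csc = M_csr.tocsc()             # convert CSR -> CSC
M_back = M_csc.tocsr()            # and back to CSR

Mt = M_csr.T                      # transpose of the CSR matrix
print(M_csr.format, M_csc.format, Mt.format)

# Check whether the transpose reuses the same underlying data array.
print(np.shares_memory(Mt.data, M_csr.data))
```

Since conversion is cheap to write, it is reasonable to build the matrix in whichever format is convenient and convert before the operation that benefits from the other one.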

In a number of cases, sparse functions convert a matrix to another format to perform certain actions. In other cases, they issue an 'efficiency' warning if the format isn't the best one for the operation. (Beware: by default, such a warning appears only once per run.)
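One case where such a warning typically shows up is inserting a new nonzero entry into a csr matrix. The sketch below is only an illustration, with an arbitrary empty matrix; it uses the standard warnings module to force the warning to be recorded every time rather than only once.

```python
import warnings
from scipy import sparse

# Hypothetical empty CSR matrix with many rows and few columns.
M = sparse.csr_matrix((1000, 5))

with warnings.catch_warnings(record=True) as caught:
    # "always" disables the default show-once behaviour inside this block.
    warnings.simplefilter("always")
    M[0, 0] = 1.0  # changes the sparsity structure of a CSR matrix

for w in caught:
    print(w.category.__name__, "-", w.message)
```

If many entries have to be set one by one, building the matrix in another format (lil, for example) and converting afterwards avoids this.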

If you are iterating over columns, or selecting mostly by column, csc is better; the converse is true for csr, which is better for row access. For math, matrix products and such, the two are equivalent.

Create the matrix one way, and do a few timing tests for typical operations.
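A timing test along those lines could look like the sketch below. The matrix size, the three operations, and the helper name time_it are assumptions chosen for illustration; substitute the operations you actually run. No expected timings are given, since they depend on the machine and the scipy version.

```python
import timeit
import numpy as np
from scipy import sparse

# Hypothetical tall, narrow matrix standing in for the real design matrix.
rng = np.random.default_rng(0)
dense = rng.random((20_000, 5))
dense[dense < 0.7] = 0.0

M_csr = sparse.csr_matrix(dense)
M_csc = M_csr.tocsc()
v = rng.random(5)

def time_it(label, fn, number=20):
    """Run fn `number` times and print the average time per call."""
    seconds = timeit.timeit(fn, number=number)
    print(f"{label}: {seconds / number:.6f} s per call")
    return seconds

timings = {}
for name, M in (("csr", M_csr), ("csc", M_csc)):
    # Default arguments bind the current M inside each lambda.
    timings[name, "column"] = time_it(f"{name} column slice", lambda M=M: M[:, 2])
    timings[name, "rows"] = time_it(f"{name} row slice", lambda M=M: M[100:200, :])
    timings[name, "product"] = time_it(f"{name} matrix-vector", lambda M=M: M @ v)
```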

hpaulj answered Oct 01 '22 14:10
