I have a design matrix that I'm converting to a sparse matrix using the scipy module.
It has many rows and only a few columns.
With this shape, is it better to use the CSC or the CSR format? Or are they strictly equivalent in execution speed?
Basically, it looks like this example (but there are many more rows in the real one):
Thanks!
You can readily convert one format to the other (.tocsc(), .tocsr()). In fact, M.T for a csr just creates a csc with the same data.
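As a minimal sketch of those conversions, assuming a made-up tall, narrow matrix (the shape, the random data and all variable names here are illustrative, not taken from the question):

```python
import numpy as np
from scipy import sparse

# Hypothetical tall, narrow design matrix: many rows, few columns.
rng = np.random.default_rng(0)
dense = rng.random((1000, 5))
dense[dense < 0.8] = 0.0          # zero out most entries so it is sparse

M_csr = sparse.csr_matrix(dense)  # build in CSR format
M_csc = M_csr.tocsc()             # convert CSR -> CSC
M_back = M_csc.tocsr()            # and back to CSR

Mt = M_csr.T                      # transpose of a CSR matrix
print(M_csr.format, M_csc.format, M_back.format, Mt.format)
```

The last value printed is the format of the transpose, which should show that transposing a csr gives a csc rather than another csr.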
In a number of cases, sparse functions convert a matrix to another format to perform certain actions. In other cases they give an 'efficiency' warning (SparseEfficiencyWarning) if the format isn't the best. (Beware: by default, such a warning appears only once per run.)
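One way to see such a warning every time, rather than only on its first occurrence, is to record warnings with the standard warnings module. This is a small sketch on a hypothetical empty 3x3 matrix; assigning to an entry that is not yet stored is one operation that should trigger the warning for a csr:

```python
import warnings
from scipy import sparse

M = sparse.csr_matrix((3, 3))  # hypothetical empty 3x3 CSR matrix

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # do not suppress repeated warnings
    M[0, 0] = 1.0                    # inserting a new nonzero changes the sparsity structure

efficiency_warnings = [
    w for w in caught if issubclass(w.category, sparse.SparseEfficiencyWarning)
]
for w in efficiency_warnings:
    print(w.message)
```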
If you are iterating over columns, or selecting mostly by column, csc is better, with the converse true for csr. For math, matrix products and such, they are equivalent.
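To make the access patterns concrete, here is a rough sketch on the same kind of made-up matrix (data and names are assumptions for illustration): column selection on the csc copy, row selection on the csr copy, and a matrix-vector product on both.

```python
import numpy as np
from scipy import sparse

# Hypothetical tall, narrow matrix.
rng = np.random.default_rng(0)
dense = rng.random((1000, 5))
dense[dense < 0.8] = 0.0

M_csc = sparse.csc_matrix(dense)
M_csr = sparse.csr_matrix(dense)

# Iterating over columns: the access pattern CSC is laid out for.
col_sums = [M_csc[:, j].sum() for j in range(M_csc.shape[1])]

# Selecting a block of rows: the access pattern CSR is laid out for.
first_rows = M_csr[:10, :]

# Matrix-vector products give the same result in either format.
v = np.ones(M_csr.shape[1])
prod_csr = M_csr @ v
prod_csc = M_csc @ v
print(np.allclose(prod_csr, prod_csc))
```

Both formats accept both kinds of indexing; the difference is only in how much work each one has to do.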
Create the matrix one way, and do a few timing tests for typical operations.
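A timing test along those lines could look like the sketch below. The matrix size, the chosen operations and the helper time_op are all hypothetical; substitute the operations you actually run on your own matrix.

```python
import timeit
import numpy as np
from scipy import sparse

# Hypothetical tall, narrow matrix; replace with your real design matrix.
rng = np.random.default_rng(0)
dense = rng.random((20000, 5))
dense[dense < 0.8] = 0.0

formats = {"csr": sparse.csr_matrix(dense), "csc": sparse.csc_matrix(dense)}
v = np.ones(dense.shape[1])

def time_op(fn, number=20):
    """Best average time per call, in seconds, over 3 repeats (illustrative helper)."""
    return min(timeit.repeat(fn, number=number, repeat=3)) / number

timings = {}
for name, M in formats.items():
    timings[name] = {
        "column slice": time_op(lambda: M[:, 2]),
        "row slice": time_op(lambda: M[100:200, :]),
        "matvec": time_op(lambda: M @ v),
    }

for name, result in timings.items():
    for op, seconds in result.items():
        print(f"{name} {op}: {seconds * 1e6:.1f} us")
```

The numbers depend on the machine, the scipy version and the sparsity of the data, so treat them as a comparison between the two formats rather than as absolute figures.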