In R, I'm trying to work with a large matrix (39,146,166 rows by 127 columns) and I'm having memory issues with a number of operations on it. I've determined that about 35% of the entries in the matrix are non-zero, and the remainder are all zeros. Is this sparse enough that I would save some memory representing this matrix using one of R's sparse matrix classes? What is a good rule of thumb for determining when a matrix is worth representing sparsely?
I don't think the sparse representation will be that much more compact. In a triplet representation you need three numbers for each non-zero entry: the row index, the column index, and the value itself. So even if the two indices are 4-byte integers, each non-zero entry costs 16 bytes, versus the 8 bytes a double occupies in "serial" (dense) storage.
By this reasoning, anything above 50% non-zero entries will take more space stored sparsely than densely. At your 35% density you would save something, but not a lot: roughly 0.35 × 16 = 5.6 bytes per cell on average, versus 8 bytes dense, about a 30% reduction. I'm posting from an iPhone under SF Bay, so I cannot test with object.size().
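Once you're back at a machine, a minimal sketch along these lines would let you check directly. It uses the Matrix package on a scaled-down matrix with the same 127 columns and ~35% density; the 10,000 rows here are an arbitrary stand-in for your 39 million.

```r
library(Matrix)

set.seed(1)
nr <- 10000; nc <- 127          # scaled-down stand-in for 39,146,166 x 127
dense <- matrix(0, nr, nc)

## fill ~35% of the entries with non-zero values
nz <- sample(length(dense), size = 0.35 * length(dense))
dense[nz] <- rnorm(length(nz))

## coerce to a sparse class; the default is dgCMatrix
## (compressed sparse column storage)
sparse <- Matrix(dense, sparse = TRUE)

format(object.size(dense),  units = "MB")  # ~ nr * nc * 8 bytes
format(object.size(sparse), units = "MB")  # ~ 12 bytes per non-zero entry
```

Note that dgCMatrix compresses the column index away, keeping only one 4-byte row index and one 8-byte value per non-zero entry plus a short column-pointer vector, so in practice the per-entry cost is closer to 12 bytes than 16 and the break-even point is nearer to two-thirds density.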