There are at least two sparse matrix packages for R. I'm looking into these because I'm working with datasets that are too big and sparse to fit in memory with a dense representation. I want basic linear algebra routines, plus the ability to easily write C code to operate on them. Which library is the most mature and best to use?
So far I've found
Anyone have experience with this?
From searching around RSeek.org a little bit, the Matrix package seems the most commonly mentioned one. I often think of CRAN Task Views as fairly authoritative, and the Multivariate Task View mentions Matrix and SparseM.
As a rough criterion, the number of non-zero elements in a sparse matrix is expected to be on the order of the number of rows or the number of columns. To create a sparse matrix in R, we can use the sparseMatrix function from the Matrix package.
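As a minimal sketch of the sparseMatrix call mentioned above, assuming the Matrix package is installed (the indices, values, and dimensions here are made up for illustration):

```r
library(Matrix)

# Non-zero entries given as (row, column, value); indices are 1-based.
i <- c(1, 3, 4)
j <- c(2, 1, 4)
x <- c(10, 20, 30)

# Build a 4 x 4 sparse matrix holding only those three entries.
m <- sparseMatrix(i = i, j = j, x = x, dims = c(4, 4))

# Basic linear algebra works directly on the sparse object.
v <- c(1, 2, 3, 4)
y <- m %*% v        # sparse matrix times dense vector
nnz <- nnzero(m)    # number of non-zero entries
```

Printing m shows the zero entries as dots, which is a quick way to check that the matrix really is stored sparsely.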
Representing a sparse matrix as a 2D array wastes a lot of memory, because the zeros carry no information in most cases. So instead of storing the zeros alongside the non-zero elements, we store only the non-zero elements, each as a triple (row, column, value).
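The triple idea can be seen with a short sketch, again assuming the Matrix package (the matrix below is invented for illustration). Note that sparseMatrix stores a compressed column form by default rather than raw triples, but summary() lists the stored entries in (row, column, value) form:

```r
library(Matrix)

# A 1000 x 1000 matrix with only three non-zero entries.
m <- sparseMatrix(i = c(1, 500, 1000), j = c(2, 500, 1),
                  x = c(1.5, 2.5, 3.5), dims = c(1000, 1000))

# summary() returns the stored entries as a data frame with columns i, j, x.
triples <- summary(m)

# Compare memory use with the equivalent dense matrix.
sparse_bytes <- as.numeric(object.size(m))
dense_bytes  <- as.numeric(object.size(as.matrix(m)))
```

The dense copy needs one double per cell (a million of them here), while the sparse object only needs storage for the three values plus its index vectors.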
A sparse matrix is one in which most of the elements are zero, as opposed to a dense matrix, in which most of the elements are non-zero. In large matrices it is common for most of the elements to be zero.
Matrix is the most common and has also just been accepted into the standard R installation (as of 2.9.0), so it should be broadly available.
Matrix in base: https://stat.ethz.ch/pipermail/r-announce/2009/000499.html
In my experience, Matrix is the best supported and most mature of the packages you mention. Its C architecture should also be fairly well-exposed and relatively straightforward to work with.