
R sweep on a Sparse Matrix

I'm attempting to apply the sweep function to a sparse matrix (dgCMatrix). Unfortunately, when I do that I get a memory error. It seems that sweep is expanding my sparse matrix to a full dense matrix.

Is there an easy way to perform this operation without it blowing up my memory?

This is what I'm trying to do.

sparse_matrix <- sweep(sparse_matrix, 1, vector_to_multiply, '*')
BDow asked Dec 09 '25 09:12


1 Answer

I'm working with a large, very sparse dgTMatrix (200k rows and 10k columns) on an NLP problem. After hours of looking for a good solution, I wrote an alternative sweep function for sparse matrices. It is very fast and memory efficient: it took just 1 second and less than 1 GB of memory to multiply all matrix rows by an array of weights. For margin = 1 it works for both dgCMatrix and dgTMatrix. For margin = 2 it needs a dgTMatrix, because a dgCMatrix has no @j slot.
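To see why margin = 1 works for both classes, it may help to look at what each class stores. The sketch below uses a small made-up matrix (the names m_c and m_t are only for illustration) and assumes the Matrix package is installed. A dgCMatrix keeps zero-based row indices in @i and column pointers in @p, while a dgTMatrix keeps one zero-based row index (@i) and one column index (@j) per stored entry.

```r
library(Matrix)

# Hypothetical 3 x 4 example with four non-zero entries.
m_c <- sparseMatrix(i = c(1, 3, 2, 3), j = c(1, 1, 3, 4),
                    x = c(10, 20, 30, 40), dims = c(3, 4))  # compressed-column form
m_t <- as(m_c, "TsparseMatrix")                              # triplet form

class(m_c)
slotNames(m_c)   # has "i", "p" and "x", but no "j"
m_c@i            # zero-based row index of each stored value
m_c@p            # column pointers, length ncol(m_c) + 1

class(m_t)
m_t@i            # zero-based row index of each stored value
m_t@j            # zero-based column index of each stored value
```

In both classes @i lines up one-to-one with @x, which is all the row-wise case needs.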

Here it is:

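sweep_sparse <- function(x, margin, stats, fun = "*") {
   f <- match.fun(fun)
   # Slots @i and @j hold zero-based indices, so add 1 for R indexing.
   if (margin == 1) {
      idx <- x@i + 1   # row index of each stored entry
   } else {
      idx <- x@j + 1   # column index of each stored entry (dgTMatrix only)
   }
   # Only the stored (non-zero) values are updated.
   x@x <- f(x@x, stats[idx])
   return(x)
}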
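A quick way to convince yourself the approach is right is to compare it with the dense sweep on a tiny matrix. The sketch below is not the function above but a hypothetical variant, sweep_sparse_any, that first coerces to triplet form so that margin = 2 also works when the input is a dgCMatrix. It assumes a general (non-symmetric, non-triangular) numeric sparse matrix, and the test data are made up. Since only stored entries are touched, this only matches sweep for functions that leave zeros at zero, such as "*".

```r
library(Matrix)

# Hypothetical variant: coerce to triplet form so both @i and @j exist,
# update the stored values, then return compressed-column form.
sweep_sparse_any <- function(x, margin, stats, fun = "*") {
  f <- match.fun(fun)
  x <- as(x, "TsparseMatrix")
  idx <- if (margin == 1) x@i + 1L else x@j + 1L
  x@x <- f(x@x, stats[idx])
  as(x, "CsparseMatrix")
}

# Made-up 3 x 4 test matrix and weights.
m <- sparseMatrix(i = c(1, 3, 2, 3), j = c(1, 1, 3, 4),
                  x = c(10, 20, 30, 40), dims = c(3, 4))
row_w <- c(2, 3, 4)
col_w <- c(1, 10, 100, 1000)

by_row <- sweep_sparse_any(m, 1, row_w)
by_col <- sweep_sparse_any(m, 2, col_w)

# Compare against the dense sweep; fine here because the matrix is tiny.
dense <- as.matrix(m)
ok_row <- all.equal(as.matrix(by_row), sweep(dense, 1, row_w, "*"),
                    check.attributes = FALSE)
ok_col <- all.equal(as.matrix(by_col), sweep(dense, 2, col_w, "*"),
                    check.attributes = FALSE)

# Row scaling can also be written as a product with a sparse diagonal matrix.
by_row_diag <- Diagonal(x = row_w) %*% m
ok_diag <- all.equal(as.matrix(by_row_diag), as.matrix(by_row),
                     check.attributes = FALSE)
```

The coercion to triplet form costs some extra memory for the column indices, so on a very large dgCMatrix with margin = 1 the original function, which only reads @i, is the lighter choice.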
David Pinto answered Dec 11 '25 00:12


