I've created a term-document matrix (TDM) in R which I want to write to a file. It is a large sparse matrix in simple triplet form, roughly 20,000 x 10,000. When I convert it to a dense matrix to add columns with cbind, I get out-of-memory errors and the process never completes. I don't want to increase my RAM.

I also want to:

- bind the tf and tf-idf matrices together (sketched below)
- save the sparse/dense matrix to CSV
- run batch machine learning algorithms, such as the J48 implementation in Weka

How do I save/load the dataset and run the batch ML algorithms within these memory constraints? If I can write a sparse matrix to a data store, can I run ML algorithms in R on the sparse matrix, still within memory constraints?
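For reference, the column bind itself can be done without ever densifying. Here is a minimal sketch using the Matrix package; tf and tfidf are hypothetical stand-ins for the two weightings mentioned above:

    library(Matrix)

    ## Convert a slam-style simple triplet matrix (i/j/v vectors) into
    ## a compressed sparse matrix that Matrix can operate on directly.
    to_sparse <- function(stm) {
      sparseMatrix(i = stm$i, j = stm$j, x = stm$v,
                   dims = c(stm$nrow, stm$ncol))
    }

    tf_s    <- to_sparse(tf)        # hypothetical tf matrix
    tfidf_s <- to_sparse(tfidf)     # hypothetical tf-idf matrix
    both    <- cbind(tf_s, tfidf_s) # stays sparse throughout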
There are a couple of solutions you could try:
1) Convert your matrix from double to integer if you are dealing with whole numbers. Integers need less memory than doubles (4 bytes per value instead of 8); see the first sketch after this list.
2) Try the bigmemory package; see the second sketch below.
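For 1), a minimal sketch, assuming the TDM holds raw term counts (tdm stands in for your matrix):

    library(slam)   # simple triplet matrices, as produced by tm

    ## In a simple triplet matrix only the non-zero values vector ($v)
    ## carries cell values; converting it to integer halves its memory
    ## use, since the i/j index vectors stay as they are.
    tdm$v <- as.integer(tdm$v)

    ## For an ordinary dense matrix, change the storage mode in place:
    m <- matrix(as.numeric(1:6), nrow = 2)
    storage.mode(m) <- "integer"

Note that this only helps the tf matrix; tf-idf weights are fractional and have to stay double.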
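For 2), a minimal sketch; the file names are hypothetical and the dimensions come from the question. Keep in mind that a big.matrix is dense, so this trades RAM for disk space rather than preserving sparsity:

    library(bigmemory)

    ## Memory-map the matrix to a file on disk so R never has to hold
    ## all 20,000 x 10,000 cells in RAM at once.
    bm <- filebacked.big.matrix(
      nrow = 20000, ncol = 10000, type = "integer",
      backingfile    = "tdm.bin",    # hypothetical file names
      descriptorfile = "tdm.desc"
    )

    ## Fill it in column blocks rather than all at once, e.g.:
    ## bm[, cols] <- as.matrix(chunk)   # 'chunk' is a hypothetical piece

    ## A later R session can re-attach the same file without reloading:
    bm2 <- attach.big.matrix("tdm.desc")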