I have a data.table of roughly 20,000 x 20,000. How do I convert it from a data.table to a matrix efficiently, in terms of both speed and memory?
I tried m = as.matrix(dt), but it takes a very long time and produces many warnings. df = data.frame(dt) also takes a very long time and ends up hitting the memory limit.
Is there an efficient way to do this? Or is there simply a function in data.table that returns dt as a matrix (as required to feed into a statistical model using the glmnet package)?
Simply wrapping it in as.matrix gives me the error below:
x = as.matrix(dt)
Error: cannot allocate vector of size 2.9 Gb
In addition: Warning messages:
1: In unlist(X, recursive = FALSE, use.names = FALSE) : Reached total allocation of 8131Mb: see help(memory.size)
2: In unlist(X, recursive = FALSE, use.names = FALSE) : Reached total allocation of 8131Mb: see help(memory.size)
3: In unlist(X, recursive = FALSE, use.names = FALSE) : Reached total allocation of 8131Mb: see help(memory.size)
4: In unlist(X, recursive = FALSE, use.names = FALSE) : Reached total allocation of 8131Mb: see help(memory.size)
My OS: I have 64-bit Windows 7 with 8 GB of RAM. Windows Task Manager has shown Rgui.exe using more than 4 GB before, and it was still fine.
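For context, a back-of-the-envelope estimate of how much memory one dense matrix of this size needs. This is only a sketch: the dimension is the approximate figure from the question, and the per-element sizes assume 64-bit R (4 bytes per integer, 8 bytes per double, and an 8-byte pointer per cell for character, with the strings themselves stored separately).

```r
# Rough size of a dense n x n matrix for different element types.
n <- 20000  # approximate dimension taken from the question

# Assumed bytes per cell on 64-bit R; a character matrix stores one
# 8-byte pointer per cell, and the strings live elsewhere.
bytes_per_element <- c(integer = 4, double = 8, character = 8)

# Size in GiB for a single copy of the matrix.
size_gib <- n * n * bytes_per_element / 1024^3
print(round(size_gib, 2))
```

A double or character matrix comes out at about 3 GiB and an integer matrix at about 1.5 GiB. That is consistent with the 2.9 Gb allocation in the error message, and it is per copy, so any intermediate copies made during conversion add to it.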
Convert a Data Frame into a Numeric Matrix in R Programming: the data.matrix() function. The data.matrix() function in R is used to create a matrix by converting all the values of a data frame into numeric mode and then binding them as a matrix.
Memory Usage (Efficiency): data.table is the most efficient when filtering rows. dplyr is far more efficient when summarizing by group, while data.table was the least efficient.
You can do so with the as.matrix function.
To convert a table into a matrix in R, we can use the apply function with the as.matrix.noquote function.
Try:
result <- as.matrix(tidytext::cast_sparse(dat_table,
column_name_of_rows,
column_name_of_columns,
column_name_of_values))
It should be very efficient and fast.
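A small sketch of what the call above expects, assuming the data is in long format (one row per row/column/value triple). The table and column names here (`long`, `row_id`, `col_id`, `value`) are made up for illustration. `cast_sparse` returns a sparse matrix from the Matrix package, and wrapping it in `as.matrix` turns it back into a dense matrix. As far as I know, glmnet also accepts a sparse matrix for `x`, so the dense step may not be needed.

```r
library(tidytext)

# Hypothetical long-format data: one row per non-zero cell.
long <- data.frame(
  row_id = c("r1", "r1", "r2", "r3"),
  col_id = c("a", "b", "b", "c"),
  value  = c(1, 2, 3, 4)
)

# Build a sparse matrix; only the listed cells are stored.
sp <- cast_sparse(long, row_id, col_id, value)
dim(sp)

# Dense version, only if the downstream code really needs one.
dense <- as.matrix(sp)
dense["r2", "b"]
```

Whether this saves memory depends on how many cells are actually zero; for a fully dense 20,000 x 20,000 table, the sparse form will not help.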
@GibsonGay:
The error was on my part: I had included a character column in the matrix, which coerced every column of the matrix to character. After removing that column, an integer matrix was created successfully without errors or warnings, and the model ran fine.
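To illustrate the fix described above, here is a minimal sketch on a toy table. The table and its column names (`id`, `x1`, `x2`) are made up; the idea is just to keep the numeric columns before calling `as.matrix`.

```r
library(data.table)

# Small stand-in for the real table; `id` is a hypothetical character column.
dt <- data.table(id = c("a", "b", "c"), x1 = 1:3, x2 = 4:6)

# Including the character column coerces every cell to character.
m_all <- as.matrix(dt)
typeof(m_all)

# Keep only the numeric (integer or double) columns before converting.
num_cols <- names(dt)[vapply(dt, is.numeric, logical(1))]
m_num <- as.matrix(dt[, ..num_cols])
typeof(m_num)
dim(m_num)
```

With all-integer columns the result stays an integer matrix, which takes half the memory of a double or character matrix of the same shape.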