I have recently discovered the wonders of the packages bigmemory, ff and filehash to handle very large matrices.
How can I handle very large (300MB++) lists? I work with these lists all day, every day. I can apply band-aid solutions with save() & load() hacks everywhere, but I would prefer a bigmemory-like solution. Something like a bigmemory big.matrix would be ideal, where I work with it basically identically to a matrix except that it takes up something like 660 bytes in my RAM.
These lists mostly have length >1000 and contain lm() objects (or similar regression objects). For example,
Y <- rnorm(1000) ; X <- rnorm(1000)
A <- lapply(1:6000, function(i) lm(Y~X))
B <- lapply(1:6000, function(i) lm(Y~X))
C <- lapply(1:6000, function(i) lm(Y~X))
D <- lapply(1:6000, function(i) lm(Y~X))
E <- lapply(1:6000, function(i) lm(Y~X))
F <- lapply(1:6000, function(i) lm(Y~X))
In my project I will have A,B,C,D,E,F-type lists (and even more than this) that I have to work with interactively.
If these were gigantic matrices there would be a tonne of support. I was wondering if there is any similar support in any package for large list objects.
You can store and access lists on disk using the filehash package. This should work (if rather slowly on my machine...):
Y <- rnorm(1000) ; X <- rnorm(1000)
# set up disk object
library(filehash)
dbCreate("myTestDB")
db <- dbInit("myTestDB")
db$A <- lapply(1:6000, function(i) lm(Y~X))
db$B <- lapply(1:6000, function(i) lm(Y~X))
db$C <- lapply(1:6000, function(i) lm(Y~X))
db$D <- lapply(1:6000, function(i) lm(Y~X))
db$E <- lapply(1:6000, function(i) lm(Y~X))
db$F <- lapply(1:6000, function(i) lm(Y~X))
List items can be accessed using the [ function. See here for more details: http://cran.r-project.org/web/packages/filehash/vignettes/filehash.pdf
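One caveat with the layout above: filehash stores each key as a single serialized object, so fetching db$A reads the whole 6000-element list back into memory. If you usually need only a few models at a time, a possible variation is to store each fit under its own key. The sketch below is only an illustration under that assumption; the key scheme ("A_0001", ...), the temporary database path and the small n are made up for the example and are not part of the original answer.

```r
library(filehash)

set.seed(1)
Y <- rnorm(1000); X <- rnorm(1000)

# Create the database in a temporary location (illustrative path)
db_path <- tempfile("modelsDB")
dbCreate(db_path)
db <- dbInit(db_path)

# Store each fit under its own key instead of one big list per key.
# The "A_0001"-style key names are a hypothetical naming scheme.
n <- 20  # kept small for illustration
keys <- sprintf("A_%04d", seq_len(n))
for (k in keys) {
  dbInsert(db, k, lm(Y ~ X))
}

# Fetch a single model; only this object is read from disk
fit <- dbFetch(db, keys[3])
print(coef(fit))

# Fetch several keys at once; `[` with a character vector returns a list
some_fits <- db[keys[1:5]]
slopes <- vapply(some_fits, function(m) coef(m)[["X"]], numeric(1))

# The handle itself stays small because the data live on disk
print(object.size(db), units = "Kb")
```

The trade-off is more keys to manage in exchange for cheaper access to individual items; dbList(db) lists the stored keys and dbDelete(db, key) removes one.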