I am building an agent-based model (ABM) in R, but I am running into memory issues when working with large objects. In particular, 8 3D arrays are created at initialization, and at each time step each 3D array is filled by different functions.
For the moment, the ABM runs over 1825 days, and 2500 individuals are simulated moving across a landscape of 1000 cells. With this configuration, I don't have memory issues.
At initialization, one 3D array is created like this:
h <- array(NA, dim=c(1825, 48, 2500),
dimnames=list(NULL, NULL, as.character(seq(1, 2500, 1))))
## 3rd dimension = individual ID
Another 3D array is created like this:
p <- array(NA, dim=c(1825, 38, 1000),
dimnames=list(NULL, NULL, as.character(seq(1, 1000, 1))))
## 3rd dimension = cell ID
Six 3D arrays are created like this:
t <- array(NA, dim=c(1825, 41, 2500),
dimnames=list(NULL, NULL, as.character(seq(1, 2500, 1))))
## 3rd dimension = individual ID
The arrays contain character/string data types.
Ideally, I would like to increase the number of individuals and/or the number of patches, but this is impossible due to memory issues. It seems there are tools such as bigmemory and gc for managing memory. Are these tools efficient? I'm a beginner in programming and have no experience with memory management or high-performance computing. Any advice is greatly appreciated; thanks for your time.
sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
From my understanding, bigmemory works only on matrices, not on multi-dimensional arrays, but you could store a multi-dimensional array as a list of matrices, as sketched below.
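Here is a minimal sketch of that list-of-matrices idea, using only the first three cell slices of your p array for brevity; the file names are illustrative. Note that big.matrix objects hold numeric types, not character strings, so this only fits data that can be coded as numbers.

library(bigmemory)
## One file-backed big.matrix per cell ID (the 3rd dimension of p),
## each holding that cell's 1825 x 38 slice on disk.
cellIDs <- as.character(1:3)   # first three slices only, for illustration
pList <- lapply(cellIDs, function(id) {
  filebacked.big.matrix(nrow = 1825, ncol = 38,
                        backingfile    = paste0("p_", id, ".bin"),
                        descriptorfile = paste0("p_", id, ".desc"))
})
names(pList) <- cellIDs
pList[["1"]][1, 1] <- 0   # roughly the counterpart of p[1, 1, "1"] <- 0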
gc is just the garbage collector, and you don't really have to call it since it will be invoked automatically, but the manual also states:
It can be useful to call gc after a large object has been removed, as this may prompt R to return memory to the operating system.
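So a pattern like the following (a minimal illustration, not something you need after every step) can help when one of the big arrays is no longer needed:

rm(p)   # drop the large object
gc()    # optional: prompts R to return the freed memory to the OS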
I think the most useful package for your task would be ff. Here's a short example to illustrate the strength of ff, which stores data on disk and uses almost no memory.
Initializing one of the arrays with base R:
p <- array(NA, dim=c(1825, 38, 1000),
dimnames=list(NULL, NULL, as.character(seq(1, 1000, 1))))
format(object.size(p), units="Mb")
"264.6 Mb"
So in total, your initial arrays would already take up roughly 5 GB of memory, which will get you into trouble once the heavy computation starts.
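As a rough back-of-the-envelope check (assuming the arrays keep the 4-byte logical NA placeholder; real character data would typically need even more, since each element is an 8-byte pointer plus the strings themselves):

## h: 1825 x 48 x 2500, p: 1825 x 38 x 1000, six t arrays: 1825 x 41 x 2500
bytes <- 4 * (1825*48*2500 + 1825*38*1000 + 6 * 1825*41*2500)
round(bytes / 1024^3, 1)
## [1] 5.3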
Initializing the same array with ff:
library(ff)
myArr <- ff(NA, dim=c(1825, 38, 1000),
dimnames=list(NULL, NULL, as.character(seq(1, 1000, 1))),
filename="arr.ffd", vmode="logical", overwrite = T)
format(object.size(myArr), units="Mb")
[1] "0.1 Mb"
Test for equality:
equals <- list()
for (i in 1:dim(p)[1]) {
equals[[i]] <- all.equal(p[i,,],
myArr[i,,])
}
all(unlist(equals))
[1] TRUE
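For the ABM itself, a minimal sketch of filling the ff array day by day might look like this (updateSlice is a hypothetical stand-in for whatever functions fill the array at each time step); only one 38 x 1000 slice is held in RAM at a time:

for (day in 1:5) {               # first few days only, for illustration
  slice <- myArr[day, , ]        # read one day's slice from disk
  ## slice <- updateSlice(slice) # hypothetical ABM update of the slice
  myArr[day, , ] <- slice        # write the slice back to the ff file
}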
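One caveat for your particular data: as far as I know, ff has no character vmode, so string data would have to be coded, e.g. as integers with a separate lookup vector. A hedged sketch, assuming the set of possible strings is known in advance (states and codeArr are made-up names):

states <- c("forage", "rest", "move")              # hypothetical labels
codeArr <- ff(NA, dim=c(1825, 38, 1000), vmode="integer",
              filename="codes.ffd", overwrite=TRUE)
codeArr[1, 1, 1] <- match("forage", states)        # store the integer code
states[codeArr[1, 1, 1]]                           # decode back to the string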