 

How to manage memory in agent-based modeling with R

I am building an agent-based model in R, but I am running into memory issues when using large objects. In particular, 8 3D arrays are created at initialization, and at each time step each 3D array is filled by different functions.

At the moment, the ABM runs over 1825 days, and 2500 individuals are simulated moving across a landscape of 1000 cells. With this configuration, I don't have memory issues.

At initialization,

  • 1 3D array is like:

    h <- array(NA, dim=c(1825, 48, 2500),
               dimnames=list(NULL, NULL, as.character(seq(1, 2500, 1))))
               ## 3rd dimension = individual ID
    
  • 1 3D array is like:

    p <- array(NA, dim=c(1825, 38, 1000),
               dimnames=list(NULL, NULL, as.character(seq(1, 1000, 1))))
               ## 3rd dimension = cell ID
    
  • 6 3D arrays are like:

    t <- array(NA, dim=c(1825, 41, 2500),
               dimnames=list(NULL, NULL, as.character(seq(1, 2500, 1))))
               ## 3rd dimension = individual ID
    

The arrays contain character/string data types.

Ideally, I would like to increase the number of individuals and/or the number of patches, but this is impossible due to memory issues. It seems that there are tools available to manage memory, such as bigmemory and gc. Are these tools efficient? I'm a beginner in programming and I have no experience with memory management or high-performance computing. Any advice is greatly appreciated; thanks for your time.

sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

asked May 31 '19 by Nell

1 Answer

From my understanding, bigmemory only works on matrices, not on multi-dimensional arrays, but you could save a multi-dimensional array as a list of matrices.
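For instance, a rough sketch of that idea, assuming the character data are first encoded as integer codes (big.matrix only holds numeric types such as "integer", "double", "short" or "char"); the file names are just placeholders:

library(bigmemory)

n_days <- 1825; n_vars <- 38; n_cells <- 1000          # dimensions of the p array

## one file-backed matrix per cell, collected in a list
p_big <- lapply(seq_len(n_cells), function(i) {
  filebacked.big.matrix(nrow = n_days, ncol = n_vars, type = "integer",
                        backingfile    = paste0("p_cell_", i, ".bin"),
                        descriptorfile = paste0("p_cell_", i, ".desc"))
})

## indexing then works like an ordinary matrix, e.g. day 1 of cell 5:
## p_big[[5]][1, ]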

gc is just the garbage collector and you don't really have to call it, since it will be called automatically, but the manual also states:

It can be useful to call gc after a large object has been removed, as this may prompt R to return memory to the operating system.
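For illustration (the object name here is just a placeholder):

big_tmp <- matrix(runif(1e7), ncol = 100)   # some large intermediate object
rm(big_tmp)                                 # remove it...
gc()                                        # ...then gc() may return the freed memory to the OS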

I think the most useful package for your task would be ff. Here's a short example to illustrate the strength of the ff package, which stores data on disk and barely touches memory.

Initializing the arrays with base R:

p <- array(NA, dim=c(1825, 38, 1000),
           dimnames=list(NULL, NULL, as.character(seq(1, 1000, 1))))

format(object.size(p), units="Mb")

"264.6 Mb"

So in total, your initial arrays would already take up roughly 5 GB of memory, which will get you into trouble with heavy computation.
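As a rough back-of-envelope check (a sketch using the dimensions from the question; the NA-filled arrays are logical, and once filled with character strings they grow further, since each element then also holds an 8-byte pointer into the string pool):

h_bytes <- as.numeric(object.size(array(NA, dim = c(1825, 48, 2500))))   # the h array
p_bytes <- as.numeric(object.size(array(NA, dim = c(1825, 38, 1000))))   # the p array
t_bytes <- as.numeric(object.size(array(NA, dim = c(1825, 41, 2500))))   # one of the six t arrays
(h_bytes + p_bytes + 6 * t_bytes) / 1024^3

## roughly 5.3 (GB), before the arrays are even filled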


Initializing the arrays with ff:

library(ff)
myArr <- ff(NA, dim=c(1825, 38, 1000), 
            dimnames=list(NULL, NULL, as.character(seq(1, 1000, 1))),
            filename="arr.ffd", vmode="logical", overwrite = T)

format(object.size(myArr), units="Mb")

[1] "0.1 Mb"


Test for equality:

equals <- list()
for (i in 1:dim(p)[1]) {
  equals[[i]] <- all.equal(p[i,,],
                           myArr[i,,])
}
all(unlist(equals))

[1] TRUE
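To give an idea of the intended access pattern (a sketch with a placeholder update; since ff stores fixed-size atomic types, character data would have to be encoded first, e.g. as integer codes or factor levels), the ABM loop could fill one day-slice at a time, so only that slice is held in RAM:

for (day in 1:5) {                            # e.g. the first 5 of the 1825 days
  daily <- matrix(sample(c(TRUE, FALSE), 38 * 1000, replace = TRUE),
                  nrow = 38, ncol = 1000)     # placeholder per-day update
  myArr[day, , ] <- daily                     # only this slice is written at a time
}
myArr[1, 1:3, 1:3]                            # read back a small chunk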

answered Oct 19 '22 by SeGa