What would be a good way to fire an event based on memory usage in R? Say I'm combining a bunch of files into one master file, but the entire master file may be too large to hold in memory. When I approach the memory limit, I'd like to save the current master file and free the memory.
master <- NULL
partnum <- 1
threshold <- 0.8
# full.names = TRUE so that read.csv() receives a usable path
filelist <- list.files(mypath, full.names = TRUE)
for (filename in filelist)
{
  filedata <- read.csv(filename)
  if (is.null(master)) master <- filedata
  else master <- rbind(master, filedata)
  rm(filedata)
  # test for memory usage here
  # if (usedMemory > availableMemory * threshold)
  # then do the following else go to top of loop
  save(master, file = paste(mypath, partnum, "rData", sep = "."))
  master <- NULL
  partnum <- partnum + 1
}
What I'd like to do is calculate the amount of memory available on the machine, so that the event fires dynamically based on current machine usage. Say there is 10GB available on the machine when the script starts, so the cleanup should happen when 8GB is used. If another user then starts a program mid-execution that consumes 5GB, I'd like the cleanup to happen when 4GB is used.
> x <- 1:10^9
> memory.size()
[1] 3832.26
> memory.limit()
[1] 16381
> gc()
            used   (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells    164953    8.9     350000   18.7    350000   18.7
Vcells 500150216 3815.9  669246830 5106.0 550150069 4197.4
At this point my machine only has 10GB available, because other processes are consuming 2GB.
You might want to try memory.size. Maybe something like this:
# Are we using more than 1 GB?
if (memory.size() > 1000) {
  # Force a garbage collect and check again
  gc()
  if (memory.size() > 1000) {
    # free up memory...
  }
}
The call to memory.size does not trigger a garbage collection, so you can either always call gc() first or do it conditionally as in the example above (garbage collection can take some time).
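To tie this back to the loop in the question, here is a minimal sketch of how the check could be wired in. The names used_mb, over_threshold, save_part and combine_in_parts are made up for illustration, and limit_mb is an assumption the caller has to supply; the sketch does not detect free memory on the machine. Because memory.size() and memory.limit() are Windows-only, the sketch reads the "used" figure from gc() instead, which also means a garbage collection runs on every check.

```r
# Hypothetical helper: memory currently used by R objects, in MB.
# gc() returns a matrix whose second column is the "used" size in Mb.
used_mb <- function() sum(gc()[, 2])

# Hypothetical helper: TRUE once usage exceeds the given fraction of the limit.
over_threshold <- function(used, limit, threshold = 0.8) {
  used > limit * threshold
}

# Hypothetical helper: save one part to disk and return its path.
save_part <- function(master, outdir, partnum) {
  out <- file.path(outdir, paste("master", partnum, "rData", sep = "."))
  save(master, file = out)
  out
}

# Combine CSV files, writing a part file whenever memory use runs high.
# limit_mb is an assumption supplied by the caller, not detected here.
combine_in_parts <- function(files, outdir, limit_mb, threshold = 0.8) {
  master <- NULL
  partnum <- 1
  saved <- character(0)
  for (f in files) {
    filedata <- read.csv(f)
    if (is.null(master)) {
      master <- filedata
    } else {
      master <- rbind(master, filedata)
    }
    rm(filedata)
    if (over_threshold(used_mb(), limit_mb, threshold)) {
      saved <- c(saved, save_part(master, outdir, partnum))
      master <- NULL
      partnum <- partnum + 1
    }
  }
  # Save whatever is left after the last file
  if (!is.null(master)) {
    saved <- c(saved, save_part(master, outdir, partnum))
  }
  saved
}
```

A call might look like combine_in_parts(list.files(mypath, full.names = TRUE), mypath, limit_mb = 8000), where 8000 is only a placeholder. To react to other processes on the machine, limit_mb would have to be recomputed inside the loop from an operating-system query, which this sketch leaves out.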