I have an issue with the R system() function (for running an OS command from within R) that arises only when the R session has used up more than some fraction of the available RAM (roughly 75% in my case), even though there is still plenty of RAM free (~15 GB in my case) and the same OS command can easily be run at the same time from a terminal.
System info:
64GB RAM PC (local desktop PC, not cloud-based or cluster)
Ubuntu 18.04.1 LTS - x86_64-pc-linux-gnu (64-bit)
R version 3.5.2 (executed directly, not e.g. via docker)
This example demonstrates the issue. The size of the data frame d needs to be adjusted so that it is as small as possible while still provoking the error; how big that is will depend on how much RAM you have and what else is running at the same time.
ross@doppio:~$ R
R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> n <- 5e8
> d <- data.frame(
+ v0 = rep_len(1.0, n),
+ v1 = rep_len(1.0, n),
+ v2 = rep_len(1.0, n),
+ v3 = rep_len(1.0, n),
+ v4 = rep_len(1.0, n),
+ v5 = rep_len(1.0, n),
+ v6 = rep_len(1.0, n),
+ v7 = rep_len(1.0, n),
+ v8 = rep_len(1.0, n),
+ v9 = rep_len(1.0, n)
+ )
> dim(d)
[1] 500000000 10
> gc()
             used    (Mb) gc trigger    (Mb)   max used    (Mb)
Ncells     260857    14.0     627920    33.6     421030    22.5
Vcells 5000537452 38151.1 6483359463 49464.2 5000559813 38151.3
> system("free -m", intern = FALSE)
Warning messages:
1: In system("free -m", intern = FALSE) :
  system call failed: Cannot allocate memory
2: In system("free -m", intern = FALSE) : error in running command
The call to gc() indicates that R has allocated ~38 GB of the 64 GB of RAM, and running free -m in a terminal at the same time (see below) shows that the OS thinks ~16 GB is still free.
ross@doppio:~$ free -m
              total        used        free      shared  buff/cache   available
Mem:          64345       44277       15904         461        4162       18896
Swap:           975           1         974
ross@doppio:~$
So free -m can't be run from within R because memory cannot be allocated, yet free -m runs without any trouble at the same time from a terminal, and you would think that 15 GB would be more than enough for a lightweight command like free -m.
If R's memory usage is below some threshold, free -m can be run from within R without a problem. My guess is that, in order to run free -m, R tries to allocate an amount of memory that is much more than the command actually needs and that depends on how much memory R has already allocated. Can anyone shed some light on what is going on here?
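For what it's worth, the figures involved can also be read from within R directly from /proc: reading a file does not fork a child process, so this still works when system() fails. This is only a rough diagnostic sketch (the field names are the standard Linux /proc ones):

# How big the kernel thinks this R process is
grep("^Vm(Size|Data)", readLines("/proc/self/status"), value = TRUE)

# How much memory the kernel has available to commit (RAM plus swap)
grep("^(MemAvailable|SwapTotal|SwapFree|CommitLimit|Committed_AS)",
     readLines("/proc/meminfo"), value = TRUE)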
Thanks
I've run into this one. R uses fork() to launch the subprocess, and the fork momentarily requires the kernel to account for a second copy of the ~38 GB R image, which pushes the total past the 64 GB you have. Had the fork succeeded, the child would immediately have called exec and given the duplicated memory back. This isn't how fork/exec is supposed to play out in practice (the child shares the parent's pages copy-on-write, so no extra memory is actually touched), but Linux's memory-overcommit accounting still has to be willing to promise that much memory, and here it refuses, so the fork fails with "Cannot allocate memory".
It looks like this may be a known limitation: to fork a process you must have enough memory (RAM plus swap) to potentially duplicate its pages, even if that duplication never actually happens. I would guess you don't have enough swap (having at least as much swap as RAM is often recommended for this reason). Here are some instructions on configuring a swap file (written for EC2, but the steps apply to Linux in general): https://aws.amazon.com/premiumsupport/knowledge-center/ec2-memory-swap-file/
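If adding swap is not convenient, another way around it is to avoid forking the big process at all: start a small helper R process before the large objects are allocated and route any system() calls through it. The helper stays small, so its fork() only has to account for a small address space. Below is just a sketch of that idea using the base parallel package (the helper is an ordinary PSOCK worker, and the large data frame stands in for whatever fills up your session):

library(parallel)

cl <- makeCluster(1)          # start the helper while the R session is still small

n <- 5e8                      # ...the large allocations happen after the helper exists
d <- data.frame(v0 = rep_len(1.0, n))

# The command is forked from the small helper, not from the ~38 GB parent
clusterEvalQ(cl, system("free -m", intern = TRUE))

stopCluster(cl)

The one thing to remember is that the helper has to exist before the session grows; once R is already using most of the RAM, even launching the helper will fail for the same reason.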