
R system() cannot allocate memory even though the same command can be run from a terminal

Tags: linux, r

I have an issue with R's system() function (used to run an OS command from within R): it fails once the R session has used more than some fraction of the available RAM (maybe ~75% in my case), even though there is still plenty of RAM free (~15 GB in my case) and the same OS command runs without trouble at the same time from a terminal.

System info:
64GB RAM PC (local desktop PC, not cloud-based or cluster)
Ubuntu 18.04.1 LTS - x86_64-pc-linux-gnu (64-bit)
R version 3.5.2 (executed directly, not e.g. via docker)

This example demonstrates the issue. The size of the data frame d needs to be adjusted to be as small as possible and still provoke the error. This will depend on how much RAM you have and what else is running at the same time.

ross@doppio:~$ R

R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> n <- 5e8
> d <- data.frame(
+   v0 = rep_len(1.0, n),
+   v1 = rep_len(1.0, n),
+   v2 = rep_len(1.0, n),
+   v3 = rep_len(1.0, n),
+   v4 = rep_len(1.0, n),
+   v5 = rep_len(1.0, n),
+   v6 = rep_len(1.0, n),
+   v7 = rep_len(1.0, n),
+   v8 = rep_len(1.0, n),
+   v9 = rep_len(1.0, n)
+ )

> dim(d)
[1] 500000000        10

> gc()
             used    (Mb) gc trigger    (Mb)   max used    (Mb)
Ncells     260857    14.0     627920    33.6     421030    22.5
Vcells 5000537452 38151.1 6483359463 49464.2 5000559813 38151.3

> system("free -m", intern = FALSE)
Warning messages:
1: In system("free -m", intern = FALSE) :
  system call failed: Cannot allocate memory
2: In system("free -m", intern = FALSE) : error in running command

The call to gc() indicates that R has allocated ~38 GB of the 64 GB of RAM, and running free -m in a terminal at the same time (see below) shows that the OS thinks there is ~16 GB free.

ross@doppio:~$ free -m
              total        used        free      shared  buff/cache   available
Mem:          64345       44277       15904         461        4162       18896
Swap:           975           1         974
ross@doppio:~$ 

So free -m can't be run from within R because memory cannot be allocated, yet it can be run at the same time from a terminal, and you would think ~15 GB free would be plenty for a lightweight command like free -m.

If the R memory usage is below some threshold then free -m can be run from within R.

I guess that R is trying to allocate an amount of memory for free -m that is more than is actually needed, and that depends on the amount of memory already allocated. Can anyone shed some light on what is going on here?

Thanks

Ross Gayler asked Jan 04 '19



1 Answer

I've run into this one. R calls fork() to run the subprocess, which momentarily requires committing a second copy of the ~38 GB image, pushing the total past the 64 GB you have. If the fork had succeeded, the child would immediately have called exec() and handed the duplicated memory back. This isn't how fork/exec is supposed to go (the pages are supposed to be copy-on-write with essentially no extra cost), but the kernel still accounts for the potential duplication in this case.
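
Not part of the original answer, but as a rough diagnostic (assuming a standard Linux /proc layout; the exact point at which fork() starts failing depends on your kernel's overcommit settings and how much swap you have), you can inspect the kernel's overcommit policy and commit accounting from a terminal while the large R session is running:

# 0 = heuristic overcommit (the Ubuntu default), 1 = always allow, 2 = strict accounting
cat /proc/sys/vm/overcommit_memory
# CommitLimit is the hard cap (only enforced in mode 2); Committed_AS is what is
# currently committed. A fork of the R process has to commit roughly the whole
# R image again, at least on paper.
grep -E 'CommitLimit|Committed_AS' /proc/meminfo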

It looks like this may be known behaviour: to fork you must have enough memory (RAM plus swap) to potentially duplicate the pages, even if that duplication never actually happens. I would guess you don't have enough swap (your free output shows under 1 GB; at least the size of RAM is often recommended). Here are some instructions on configuring swap (written for EC2, but they cover Linux in general): https://aws.amazon.com/premiumsupport/knowledge-center/ec2-memory-swap-file/
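
For reference, a minimal sketch of adding a swap file on Ubuntu, along the lines of what the linked article describes (the 64G size and /swapfile path are illustrative choices, not values from the article; pick what suits your machine):

# create, protect, format, and enable a swap file
sudo fallocate -l 64G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# add "/swapfile swap swap defaults 0 0" to /etc/fstab to keep it across reboots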

John Mount answered Oct 22 '22