
R memory issue with memory.limit()

I am running some simulations on a machine with 16 GB of memory. First, I ran into some errors:

Error: cannot allocate vector of size 6000.1 Mb (the number might not be accurate)

Then I tried to allocate more memory to R by using:

memory.limit(1E10)

The reason for choosing such a big number is that memory.limit() would not let me select anything smaller than my system's total memory:

In memory.size(size) : cannot decrease memory limit: ignored

After doing this, I could finish my simulations, but R took around 15 GB of memory, which stopped me from doing any post-analysis.

I used object.size() to estimate the total memory used by all the generated variables, which came to only around 10 GB. I could not figure out where R took the rest of the memory. So my question is: how do I reasonably allocate memory to R without exhausting my machine? Thanks!
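One rough way to audit this, as a sketch (the object names here are only placeholders), is to list the size of every object in the workspace and then force a garbage collection; the total will still understate R's real footprint because of temporary copies:

# List the size (in bytes) of every object in the global environment,
# largest first, then force a garbage collection and report memory use.
obj_sizes <- sapply(ls(envir = .GlobalEnv),
                    function(x) object.size(get(x, envir = .GlobalEnv)))
sort(obj_sizes, decreasing = TRUE)
gc()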

TTT asked Jan 16 '13 06:01



People also ask

How do I limit memory usage in R?

Use memory.limit(). You can increase the default using this command, memory.limit(size=2500), where the size is in MB.
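For completeness, a short sketch (not part of the snippet above) of querying the current settings on Windows before changing anything:

# Current memory limit in MB (Windows-only function)
memory.limit()
# Memory currently in use by R, in MB
memory.size()
# Maximum memory obtained from the OS so far in this session, in MB
memory.size(max = TRUE)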

Does R have a memory limit?

Under most 64-bit versions of Windows the limit for a 32-bit build of R is 4Gb: for the oldest ones it is 2Gb. The limit for a 64-bit build of R (imposed by the OS) is 8Tb.

How do I increase memory limit in RStudio?

Navigate to the directory C:\Program Files\RStudio\bin using cd, then start rstudio.exe from there. You may need to adapt this depending on where your RStudio folder is located on your computer. Then append --max-mem-size=4GB and press Enter. You will need to repeat this every time you want to start an R session.

What is the memory limit in R for 64-bit system?

For 64-bit versions of R under 64-bit Windows the limit is currently 8Tb. Memory limits can only be increased.




1 Answer

R is interpreted, so WYSINAWYG (what you see is not always what you get). As mentioned in the comments, you need more memory than is required to store your objects, because R copies those objects. Also, besides being inefficient, nested for loops can be a bad idea because gc won't run in the innermost loop. If you have any of these, I suggest you try to replace them with vectorised methods, or manually call gc in your loops to force garbage collections, but be warned this will slow things down somewhat.
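As a rough illustration of that second suggestion (the loop body is a made-up stand-in for a real simulation step, not code from the question), calling gc every so often inside a long loop looks like this:

# Pre-allocate a list for the results, then force a garbage collection
# every 100 iterations so temporary copies don't pile up.
results <- vector("list", 1000)
for (i in seq_len(1000)) {
  results[[i]] <- rnorm(1e4)   # stand-in for one simulation step
  if (i %% 100 == 0) gc()      # explicit garbage collection
}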

The issue of the memory required for even simple objects can be illustrated by the following example. This code grows a data.frame object; watch the memory use before and after, and the size of the resulting object. A lot of garbage is allowed to accumulate before gc is invoked. I think garbage collection is more problematic on Windows than on *nix systems: I am not able to replicate the example below on Mac OS X, but I can do so repeatedly on Windows. The loop and more explanation can be found in The R Inferno, page 13...

# Current memory usage in Mb (Windows-only function)
memory.size()
# [1] 130.61
n <- 1000

# Run a loop that grows my.df by rbind-ing a small piece on each iteration
my.df <- data.frame(a = character(0), b = numeric(0))
for (i in 1:n) {
  this.N <- rpois(1, 10)
  my.df <- rbind(my.df, data.frame(a = sample(letters, this.N, replace = TRUE),
                                   b = runif(this.N)))
}

# Current memory usage afterwards (in Mb)
memory.size()
# [1] 136.34

# BUT... size of my.df itself
print(object.size(my.df), units = "Mb")
# 0.1 Mb
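For contrast, here is a sketch (mine, not part of the original example) of the same loop written with pre-allocation, using the same n as above: the pieces are collected in a list and bound together once at the end, so my.df is not copied and re-grown on every iteration.

pieces <- vector("list", n)              # pre-allocate one slot per iteration
for (i in seq_len(n)) {
  this.N <- rpois(1, 10)
  pieces[[i]] <- data.frame(a = sample(letters, this.N, replace = TRUE),
                            b = runif(this.N))
}
my.df <- do.call(rbind, pieces)          # a single rbind instead of n of them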
Simon O'Hanlon answered Oct 02 '22 00:10
