 

Exceeding memory limit in R (even with 24GB RAM)

I am trying to merge two dataframes: one has 908450 observations of 33 variables, and the other has 908450 observations of 2 variables.

dataframe2 <- merge(dataframe1, dataframe2, by = "id")

I've cleared all other dataframes from working memory, and reset my memory limit (for a brand new desktop with 24 GB of RAM) using the code:

memory.limit(24576)

But I'm still getting the error "cannot allocate vector of size 173.Mb".

Any thoughts on how to get around this problem?

asked Jul 19 '12 by roody


People also ask

How do I increase the memory limit in RStudio?

Windows users may get the error that R has run out of memory. If you have R already installed and subsequently install more RAM, you may have to reinstall R in order to take advantage of the additional capacity.
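On Windows builds of R before 4.2.0 (which removed memory.limit()), you can also raise the cap from the console without reinstalling; a minimal sketch, where the 32768 MB value is just an example:

memory.limit()                # report the current cap, in MB
memory.limit(size = 32768)    # request a larger cap (here ~32 GB)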

What do you do when R runs out of memory?

You can force R to run its garbage collector, and free unused memory right away, by running the gc() command in R or, in RStudio, going to Tools -> Memory -> Free Unused R Memory.
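As a quick illustration, gc() also prints a table of current usage (Ncells/Vcells), which makes it easy to confirm that dropping an object actually released memory; the object name here is a placeholder:

rm(big_unused_dataframe)   # drop the reference first (placeholder name)
gc()                       # force a collection and print current usage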

Does R have a memory limit?

Under most 64-bit versions of Windows the limit for a 32-bit build of R is 4 GB; for the oldest ones it is 2 GB. The limit for a 64-bit build of R (imposed by the OS) is 8 TB.

How do I check my memory limit in R?

Determining your memory limits in R: RAM is capped at roughly 3.5 GB on x32 Windows systems, and at the installed RAM on x64 Windows (W7/W8/W10), macOS, and Linux machines. Two calls, memory.limit() and memory.size(), return the maximum amount of RAM available to R and how much is being used by your current R session, respectively.
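A minimal sketch of those two calls on a Windows build of R (both report megabytes, and note that both were removed in R 4.2.0):

memory.limit()   # maximum amount of memory (MB) this session may use
memory.size()    # memory (MB) currently in use by this session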


2 Answers

To follow up on my comments, use data.table. I put together a quick example matching your data to illustrate:

library(data.table)

# build example data matching the question's dimensions
dt1 <- data.table(id = 1:908450, matrix(rnorm(908450 * 32), ncol = 32))
dt2 <- data.table(id = 1:908450, rnorm(908450))

# set keys so the join is indexed
setkey(dt1, id)
setkey(dt2, id)

# check dims
> dim(dt1)
[1] 908450     33
> dim(dt2)
[1] 908450      2

# merge together and check system time:
> system.time(dt3 <- dt1[dt2])
   user  system elapsed 
   0.43    0.03    0.47 

So it took less than half a second to merge them. I took before and after screenshots while watching my memory usage: before the merge, I was using 3.4 GB of RAM; when I merged, it jumped to 3.7 GB and leveled off. I think you'll be hard pressed to find anything more memory- or time-efficient than that.

[Before/after screenshots of memory usage omitted.]
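On newer versions of data.table (1.9.6 and later), the same join can also be written without setting keys, by passing the join column via the on argument; a minimal equivalent sketch:

dt3 <- dt1[dt2, on = "id"]   # keyless join on the shared id column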

answered Oct 04 '22 by Chase


As far as I can think of, there are three solutions:

  • Use data.table (as in the accepted answer)
  • Use swap memory (adjustable on *nix machines)
  • Use sampling (see the sketch below)
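For the sampling route, here is a minimal sketch using the question's object names (the 100000-row sample size is an arbitrary choice): prototype the merge on a random subset first, so a failed attempt doesn't cost you the full 900k-row allocation.

set.seed(42)                                     # reproducible sample
idx <- sample(nrow(dataframe1), 100000)          # draw 100k random row indices
sample_merge <- merge(dataframe1[idx, ], dataframe2, by = "id")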
answered Oct 04 '22 by user974514