R Cannot allocate memory though memory seems to be available

After running several models, I need to run a system() command from my R script to shut down my EC2 instance, but when I get to that point I get:

cannot popen 'ls', probable reason 'Cannot allocate memory'

Note: for this question I even tried a plain ls, which failed in the same way.

The flow of my script is the following:

  • Load a model (about 2GB)
  • Mine documents and write to a MySQL database
  • Repeat the two steps above around 20 times with different models, with an average size of 2GB each
  • Terminate the instance

At this point I need to call system("sudo shutdown -h now"), but nothing happens. When I try system("sudo shutdown -h now", intern=TRUE) instead, I get the allocation error.

I tried rm() for all my objects just before calling the shutdown, but the same error persists.

Here is some data on my system, which is a large EC2 Ubuntu instance:

R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] RTextTools_1.3.9   tau_0.0-15         glmnet_1.8         Matrix_1.0-6      
 [5] lattice_0.20-10    maxent_1.3.2       Rcpp_0.9.13        caTools_1.13      
 [9] bitops_1.0-4.1     ipred_0.8-13       prodlim_1.3.2      KernSmooth_2.23-8 
[13] survival_2.36-14   mlbench_2.1-1      MASS_7.3-21        rpart_3.1-54      
[17] e1071_1.6-1        class_7.3-4        tm_0.5-7.3         nnet_7.3-4        
[21] tree_1.0-31        randomForest_4.6-6 SparseM_0.96       RMySQL_0.9-3      
[25] ggplot2_0.9.1      DBI_0.2-5         

loaded via a namespace (and not attached):
 [1] colorspace_1.1-2   dichromat_1.2-4    digest_0.5.2       grid_2.15.1       
 [5] labeling_0.2       memoise_0.1        munsell_0.3        plyr_1.7.1        
 [9] proto_0.3-9.2      RColorBrewer_1.0-5 reshape2_1.2.1     scales_0.2.1      
[13] slam_0.1-25        stringr_0.6.1    

gc() returns

          used (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells 1143171 61.1    5234604  279.6   5268036  281.4
Vcells 1055057  8.1  465891772 3554.5 767962930 5859.1
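
As an aid to reading the gc() table above, here is a small sketch of the arithmetic behind the Vcells row. It assumes a 64-bit build of R, where each Vcell is 8 bytes; the counts are copied from the output above, and the variable names are only illustrative.

```r
# Values copied from the Vcells row of the gc() output above.
vcells_used <- 1055057
vcells_max  <- 767962930
bytes_per_vcell <- 8  # assumption: 64-bit R, where a Vcell is 8 bytes

# Convert cell counts to megabytes, as gc() does in its "(Mb)" columns.
used_mb <- vcells_used * bytes_per_vcell / 1024^2
max_mb  <- vcells_max  * bytes_per_vcell / 1024^2
cat(sprintf("current: %.1f Mb, peak: %.1f Mb\n", used_mb, max_mb))

# The same columns can be read by name from a live session.
g <- gc()
live_used_mb <- g["Vcells", "used"] * bytes_per_vcell / 1024^2
```

A low "used" figure next to a high "max used" figure suggests that R itself has already released the objects. Whether the operating system has received that memory back from the R process is a separate matter that gc() does not report.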

I noticed that if I run just 1 model instead of the 20 it works fine, so it might be that memory is not being freed after each run, even though I rm() the objects I used.
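
A minimal sketch of per-iteration cleanup, under the assumption that each model can be handled inside a function so that its objects go out of scope on return. The function name process_model and the small numeric vector are stand-ins, not the original code. Note that rm() only removes the binding, and gc() asks R to collect right away; neither guarantees that the process hands memory back to the operating system.

```r
# Hypothetical stand-in for "load a model, mine documents, write results".
process_model <- function(i) {
  model <- numeric(1e6)     # placeholder for a large model object
  result <- length(model)   # placeholder for the real work
  rm(model)                 # drop the binding before returning
  result
}

results <- numeric(0)
for (i in 1:3) {
  results <- c(results, process_model(i))
  invisible(gc())           # request a collection after each iteration
}
```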

I also noticed that if I close R, restart it, and then call system(), it works. If there is a way to restart R from within R, then maybe I can add that to my script.sh flow.

What would be the appropriate way to clean up all of my objects and free the memory on each loop iteration, so that there is no memory issue when I need to call system()?

Any tip in the right direction will be much appreciated! Thanks

JordanBelf asked Sep 07 '12 17:09


1 Answer

I'm just posting this because it's too long to fit in the comments. Since you haven't included any code, it's pretty hard to give advice. But here is some code that you could think about.

wd <- getwd()  # remembered in .RData so the new session can return here
assign(".First", function() {
  require("plyr")  # and whatever other packages you're using
  file.remove(".RData")  # already been loaded
  rm(".Last", pos = .GlobalEnv)  # otherwise won't be able to quit R without it restarting
  setwd(wd)
}, pos = .GlobalEnv)
assign(".Last", function() {
  system("R --no-site-file --no-init-file --quiet")  # start the replacement session
}, pos = .GlobalEnv)
save.image()  # or only save the things you want to be reloaded
q("no")

The idea is that you save the things you need in a file called .RData. You create a .Last function that will be run when you quit R. The .Last function will start a new session of R. And you create a .First function that will be run as soon as R is restarted. The .First function will load packages you need and clean up.

Now you can quit R, and it will restart and load the things you need.

(q("no") means don't save, but you already saved everything you need in .RData which will be loaded when it restarts)
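
A related variation, offered only as a sketch and not as part of the approach above: instead of restarting one session, a small driver session could run each model in its own child R process, so that the child's memory goes back to the operating system when it exits. The helper name run_in_child is hypothetical, and the expression passed to it is a trivial stand-in for the real per-model work. This only helps if the driver itself stays small, since in the question it was the large session that could not spawn a command.

```r
# Path to the Rscript binary that ships with the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Hypothetical helper: evaluate one R expression in a fresh child process
# and return whatever it prints.
run_in_child <- function(expr_text) {
  system2(rscript, c("--vanilla", "-e", shQuote(expr_text)), stdout = TRUE)
}

# Stand-in for "load one model, mine documents, write to MySQL":
# the child just reports its own process id.
child_pid  <- as.integer(run_in_child("writeLines(as.character(Sys.getpid()))"))
parent_pid <- Sys.getpid()
```

In a real script the expression would be replaced by a call to a worker script that takes the model name as an argument, and the final system("sudo shutdown -h now") would be issued from the driver.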

GSee answered Sep 19 '22 06:09