R multicore mcfork(): Unable to fork: Cannot allocate memory


I'm getting the titular error:

mcfork(): Unable to fork: Cannot allocate memory 

after trying to run a function with mcapply, but top says I'm only using 51% of memory.

This is on an EC2 instance, but I do have up-to-date R.

Does anyone know what else can cause this error?

Thanks,

-N

Asked Mar 27 '13 by N. McA.

2 Answers

The issue might be exactly what the error message suggests: there isn't enough memory to fork and create parallel processes.

R essentially needs to create a copy of everything that's in memory for each individual process (to my knowledge it doesn't utilize shared memory). If you are already using 51% of your RAM with a single process, then you don't have enough memory to create a second one, since that would require 102% of your RAM in total.

Try:

  1. Using fewer cores - If you were trying to use 4 cores, it's possible you have enough RAM to support 3 parallel workers, but not 4. registerDoMC(2), for example, will set the number of parallel workers to 2 (if you are using the doMC parallel backend).
  2. Using less memory - Without seeing the rest of your code, it's hard to suggest ways to accomplish this. One thing that might help is figuring out which R objects are taking up all the memory (see: Determining memory usage of objects?) and then removing any objects you don't need from memory (rm(my_big_object)). A short sketch of points 1 and 2 follows this list.
  3. Adding more RAM - If all else fails, throw hardware at it so you have more capacity.
  4. Sticking to single threading - Multithreaded processing in R is a tradeoff of CPU and memory. It sounds like in this case you may not have enough memory to support the CPU power you have, so the best course of action might be to just stick to a single core.
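For example, here is a minimal sketch of points 1 and 2 combined, assuming the work is driven by parallel::mclapply; inputs, heavy_fun, and my_big_object are placeholders for the asker's own data and code:

library(parallel)

# Point 2: see which objects are using the most memory ...
sort(sapply(ls(), function(x) object.size(get(x))), decreasing = TRUE)

# ... and drop anything the workers don't need, then trigger a collection.
rm(my_big_object)   # placeholder name
gc()

# Point 1: run with fewer forked workers than the machine has cores.
results <- mclapply(inputs, heavy_fun, mc.cores = 2)   # placeholder inputs/function

Because each forked worker starts out as a copy of the parent, trimming the parent's workspace before the mclapply call directly reduces how much memory every worker needs.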
Answered Sep 17 '22 by Mike Monteiro


The R function mcfork is only a thin wrapper around the fork syscall (by the way, the man page says that this call is itself a wrapper around clone).
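For illustration, here is a minimal R-level sketch of that machinery, using parallel::mcparallel and parallel::mccollect (which mclapply builds on); the expression being evaluated is just a placeholder:

library(parallel)

# Fork a child process to evaluate the expression; mcparallel() calls
# mcfork(), and ultimately fork(), internally. If the kernel refuses the
# fork, you hit the same kind of "unable to fork" error as with mclapply().
job <- mcparallel(sum(rnorm(1e6)))   # placeholder expression

# Block until the child finishes and collect its result.
mccollect(job)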

I created a simple C++ program to test fork's behaviour:

#include <stdio.h>
#include <unistd.h>
#include <vector>

int main(int argc, char **argv)
{
    printf("--beginning of program\n");

    std::vector<std::vector<int> > l(50000, std::vector<int>(50000, 0));

//    while (true) {}

    int counter = 0;
    pid_t pid = fork();
    pid = fork();
    pid = fork();

    if (pid == 0)
    {
        // child process
        int i = 0;
        for (; i < 5; ++i)
        {
            printf("child process: counter=%d\n", ++counter);
        }
    }
    else if (pid > 0)
    {
        // parent process
        int j = 0;
        for (; j < 5; ++j)
        {
            printf("parent process: counter=%d\n", ++counter);
        }
    }
    else
    {
        // fork failed
        printf("fork() failed!\n");
        return 1;
    }

    printf("--end of program--\n");
    while (true) {}
    return 0;
}

First, the program allocates roughly 10 GB of data on the heap (a 50,000 x 50,000 matrix of ints). Then it spawns 2 x 2 x 2 = 8 processes via three successive fork calls and enters an infinite loop, so that the processes are easy to spot in a task manager and sit there until killed by the user.

Here are my observations:

  1. For the fork to succeed, at least 51% of memory had to be free on my system, but this includes swap. You can change this behaviour by editing the /proc/sys/vm/overcommit_* proc files (see the sketch after this list).
  2. As expected, none of the children take any more memory, so that 51% of free memory remains free throughout the course of the program, and none of the subsequent forks fail.
  3. The memory is shared (copy-on-write) between the forks, so it gets reclaimed only after the last child is killed.
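As referenced in point 1, the overcommit settings and the current memory headroom can be inspected without leaving R; a small sketch, assuming a Linux system with the usual /proc layout:

# Overcommit policy: 0 = heuristic, 1 = always allow, 2 = strict accounting.
readLines("/proc/sys/vm/overcommit_memory")
readLines("/proc/sys/vm/overcommit_ratio")

# Rough picture of free RAM, free swap and commit accounting before forking.
grep("^(MemFree|MemAvailable|SwapFree|CommitLimit|Committed_AS):",
     readLines("/proc/meminfo"), value = TRUE)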

Memory fragmentation issue

You should not be concerned about any kind of memory fragmentation with respect to fork. R's internal memory fragmentation doesn't apply here, because fork operates on virtual memory. You also shouldn't worry about fragmentation of physical memory, because virtually all modern operating systems use virtual memory (which is also what enables them to use swap). The only fragmentation that might matter is fragmentation of the virtual address space, but AFAIK on Linux that space is 2^47 bytes, which is enormous, so for many decades you should not have any problem finding a contiguous region of any practical size.

Summary:

Make sure you have more swap than physical memory, and as long as your computations don't actually need more memory than you have in RAM, you can mcfork them as much as you want.

Or, if you are willing to risk the stability of the whole system (memory starvation), try echo 1 >/proc/sys/vm/overcommit_memory as root on Linux.

Or, better yet (safer):

echo 2 >/proc/sys/vm/overcommit_memory
echo 100 >/proc/sys/vm/overcommit_ratio

You can read more about overcommitting here: https://www.win.tue.nl/~aeb/linux/lk/lk-9.html

Answered Sep 16 '22 by Adam Ryczkowski