Is there a way to prevent R from ever using any virtual memory on a unix machine? Whenever it happens it is because I screwed up and I then want to abort the computation.
I am working with big datasets on a powerful computer shared with several other people. Sometimes I set off commands that require more RAM than is available, which causes R to start swapping and eventually freeze the whole machine. Normally I can solve this by setting a ulimit in my ~/.bashrc:
ulimit -m 33554432 -v 33554432 # 32 GB RAM of the total 64 GB
which causes R to throw an error and abort when it tries to allocate more memory than is available. However, if I make a mistake of this sort when parallelizing (typically using the snow package), the ulimit has no effect and the machine crashes anyway. I guess that is because snow launches the workers as separate processes that are not run through bash. If I instead try to set the ulimit in my ~/.Rprofile, I just get an error:
> system("ulimit -m 33554432 -v 33554432")
ulimit: 1: too many arguments
Could someone help me figure out a way to accomplish this?
Why can I not set a ulimit of 0 virtual memory in bash?
$ ulimit -m 33554432 -v 0
If I do, it quickly shuts down.
R holds the objects it is using in virtual memory. The Memory-limits help page documents the current design limitations on large objects; these differ between 32-bit and 64-bit builds of R.
To get around this, Unix uses a technique called virtual memory. It doesn't try to hold all the code and data for a process in memory. Instead, it keeps around only a relatively small working set; the rest of the process's state is left in a special swap space area on your hard disk.
Linux supports virtual memory, that is, using a disk as an extension of RAM so that the effective size of usable memory grows correspondingly. The kernel will write the contents of a currently unused block of memory to the hard disk so that the memory can be used for another purpose.
Running cat /proc/meminfo in a terminal displays the contents of /proc/meminfo, a virtual file that reports the amount of available and used memory. It contains real-time information about the system's memory usage as well as the buffers and shared memory used by the kernel.
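For example, a quick way to check how much physical memory and swap are still free before launching a large job (a minimal sketch; these are the standard field names in /proc/meminfo on current Linux kernels, and MemAvailable requires a reasonably recent kernel):

$ grep -E 'MemTotal|MemAvailable|SwapTotal|SwapFree' /proc/meminfo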
When you run system("ulimit ...") it executes in a child process, and the parent does not inherit the ulimit from that child. (This is analogous to doing system("cd dir") or system("export ENV_VAR=foo").)
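You can see the same thing directly in the shell: a limit set in a child shell disappears when the child exits (a minimal sketch; the output shown assumes no virtual-memory limit was set beforehand):

$ ulimit -v                                # limit in the current (parent) shell
unlimited
$ sh -c 'ulimit -v 33554432; ulimit -v'    # set and query the limit inside a child shell
33554432
$ ulimit -v                                # the parent shell is unaffected
unlimited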
Setting it in the shell from which you launch R is the correct approach. The limit most likely does not work in the parallel case because it is a per-process limit, not a global system limit, and the snow workers are separate processes that never pass through that shell.
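One possible workaround, sketched below, is to wrap the R executable the workers run in a small script that applies the ulimit before exec'ing the real binary, so every worker started through it inherits the cap. The path to R and the 32 GB figure are illustrative assumptions; check how your snow cluster type actually launches its workers and whether it lets you override that command.

#!/bin/sh
# Hypothetical wrapper: cap the address space, then hand off to the real R binary.
# The 32 GB figure and the path to R are illustrative assumptions.
ulimit -v 33554432              # 32 GB expressed in kB
exec /usr/lib/R/bin/R "$@"      # replace this shell with R, keeping the limit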
On Linux you can configure strict(er) overcommit accounting, which tries to prevent the kernel from handing out an mmap request that cannot be backed by physical memory.
This is done by tuning the sysctl parameters vm.overcommit_memory and vm.overcommit_ratio. (Google about these.)
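A minimal sketch of what that tuning might look like (the 80% ratio is an illustrative choice; with vm.overcommit_memory=2 the commit limit is roughly swap plus overcommit_ratio percent of RAM, so size it for your machine):

$ sudo sysctl -w vm.overcommit_memory=2   # strict accounting: refuse allocations over the commit limit
$ sudo sysctl -w vm.overcommit_ratio=80   # commit limit = swap + 80% of physical RAM

To keep the settings across reboots, add the same two lines (without sysctl -w) to /etc/sysctl.conf.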
This can be an effective way to prevent thrashing situations. But the tradeoff is that you lose the benefit that overcommit provides when things are well-behaved (cramming more/larger processes into memory).