
GC and memory limit issues with R

I am using R on some relatively big data and am hitting some memory issues. This is on Linux. I have significantly less data than the available memory on the system, so it's an issue of managing transient allocation.

When I run gc(), I get the following listing:

           used   (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells   2147186  114.7    3215540  171.8   2945794  157.4
Vcells 251427223 1918.3  592488509 4520.4 592482377 4520.3

yet R appears to have 4GB allocated in resident memory and 2GB in swap. I'm assuming this is OS-allocated memory that R's memory management system will allocate and GC as needed. However, let's say that I don't want to let R OS-allocate more than 4GB, to prevent swap thrashing. I could always use ulimit, but then it would just crash instead of working within the reduced space and GCing more often. Is there a way to specify an arbitrary maximum for the gc trigger and make sure that R never OS-allocates more? Or is there something else I could do to manage memory usage?
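To make this concrete, here is roughly what I am doing to watch the collector; the --max-vsize idea in the last comment comes from the ?Memory docs and is exactly the part I have not been able to confirm (it may just fail the allocation, like ulimit, rather than force more frequent collections):

## Trace the collector: gcinfo(TRUE) prints a line at every collection,
## which shows how often R GCs and how large the heaps grow during the job.
gcinfo(TRUE)

tmp <- replicate(20, rnorm(1e6), simplify = FALSE)  # transient allocations
rm(tmp)

## 'max used' is a high-water mark; gc(reset = TRUE) clears it so one step
## of the analysis can be profiled in isolation.
gc(reset = TRUE)

## What I am after is a hard ceiling on the vector heap / gc trigger, e.g.
## starting R with something like  R --max-vsize=4G  -- but whether that
## forces earlier GC or simply errors out is the open question.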

asked by bsdfish

1 Answer

In short: no. I found that you simply cannot micromanage R's memory management or gc().

On the other hand, you could try to keep your data in memory, but 'outside' of R. The bigmemory package makes that fairly easy. Of course, using a 64-bit version of R and ample RAM may make the problem go away too.
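A minimal sketch of the bigmemory approach, assuming the package is installed; the dimensions and file names are placeholders:

library(bigmemory)

## A file-backed big.matrix lives in a memory-mapped file, not on R's heap,
## so the garbage collector never has to trace or copy it; the R object is
## just a small descriptor/pointer.
x <- filebacked.big.matrix(nrow = 1e6, ncol = 10, type = "double",
                           backingfile = "data.bin",
                           descriptorfile = "data.desc")

## Values are written straight into the mapped file; only the rnorm()
## vector itself is a transient allocation on the R heap.
x[, 1] <- rnorm(1e6)
mean(x[, 1])

## Another process (or a later session) can attach the same data cheaply:
y <- attach.big.matrix("data.desc")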

answered by Dirk Eddelbuettel