I am trying to take advantage of a quad-core machine by parallelizing a costly operation that is performed on a list of about 1000 items.
I am using R's parallel::mclapply function currently:
res = rbind.fill(parallel::mclapply(lst, fun, mc.cores=3, mc.preschedule=T))
Which works. The problem is that every additional subprocess that is spawned ends up allocating a large chunk of memory.
Ideally, I would like each core to access shared memory from the parent R process, so that as I increase the number of cores used in mclapply, I don't hit RAM limitations before core limitations.
I'm currently at a loss on how to debug this issue. All of the large data structures that each process accesses are globals (currently). Is that somehow the issue?
I did increase my shared memory max setting for the OS to 20 GB (available RAM):
$ cat /etc/sysctl.conf
kern.sysv.shmmax=21474836480
kern.sysv.shmall=5242880
kern.sysv.shmmin=1
kern.sysv.shmmni=32
kern.sysv.shmseg=8
kern.maxprocperuid=512
kern.maxproc=2048
I thought that would fix things, but the issue still occurs.
Any other ideas?
Here is a tip about what might have been going on, from R-devel Digest, Vol 149, Issue 22.
Radford Neal's answer from Jul 26, 2015:
When mclapply forks to start a new process, the memory is initially shared with the parent process. However, a memory page has to be copied whenever either process writes to it. Unfortunately, R's garbage collector writes to each object to mark and unmark it whenever a full garbage collection is done, so it's quite possible that every R object will be duplicated in each process, even though many of them are not actually changed (from the point of view of the R programs).
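As a rough illustration of Radford Neal's point (my own sketch, not from the thread; big_list and the sizes are invented): with many small objects on the heap, a full garbage collection inside a forked child writes mark bits into every object header, which dirties most of the pages those objects live on and turns shared copy-on-write pages into private copies.

library(parallel)

# Illustrative sketch only: big_list is a heap full of small objects.
big_list <- replicate(1e6, rnorm(10), simplify = FALSE)

res <- mclapply(1:2, function(i) {
  gc()  # full collection: writes mark bits across the parent's objects
  sum(vapply(big_list, sum, numeric(1)))  # the work itself only reads big_list
}, mc.cores = 2)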
Linux and macOS use a copy-on-write mechanism when forking: the pages of memory are not actually copied, but shared until the first write. mclapply is based on fork(), so (unless your function writes to your big shared data) the memory your process monitor reports is probably not memory that has actually been allocated separately for each child.
But when collecting the results, the master process will have to allocate memory for each result returned by mclapply.
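One generic way to keep that collection step cheap (a hedged sketch of my own; fun_small and its summary columns are invented, since we don't know what fun does) is to have each worker return only a small summary object rather than a large one:

# Hypothetical sketch: each worker returns a tiny data frame, so the master
# allocates little memory when it gathers the results.
# (Assumes rbind.fill comes from plyr, as in the question.)
fun_small <- function(item) {
  data.frame(n = length(item), total = sum(item))  # placeholder summary
}
res <- plyr::rbind.fill(parallel::mclapply(lst, fun_small,
                                           mc.cores = 3, mc.preschedule = TRUE))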
To help you further, we would need to know more about your fun function.
I guess I would have thought this would not use extra memory, because of the copy-on-write functionality. I take it the elements of the list are large? Perhaps when R passes the elements to fun() it is actually making a copy of each list item instead of using copy-on-write. If so, the following might work better:
fun <- function(itemNumber) {
  myitem <- lst[[itemNumber]]
  # now do your computations
}
res = rbind.fill(parallel::mclapply(1:length(lst), fun, mc.cores=3, mc.preschedule=T))
Or use lst[[itemNumber]] directly in your function. If R/Linux/macOS isn't smart enough to use copy-on-write with the function as you originally wrote it, it may be with this modified approach.
Edit: I assume you are not modifying the items in the list. If you do, R is going to make copies of the data.
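A small sketch of that caveat (fun_modifies is hypothetical and assumes the list items are numeric vectors): any write to the shared data inside a worker forces a copy in that child process.

fun_modifies <- function(itemNumber) {
  x <- lst[[itemNumber]]
  x[1] <- 0   # copy-on-modify: the vector is duplicated in the child here
  sum(x)
}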