
R parallel computing and zombie processes

This is basically a follow-up to this more specialized question. There have been several posts about the creation of zombie processes when doing parallel computing in R:

  1. How to stop R from leaving zombie processes behind
  2. How to kill a doMC worker when it's done?
  3. Remove zombie processes using parallel package

There are several ways of doing parallel computing, and I will focus on the three that I have used so far on a local machine. I used doMC and doParallel with the foreach package on a local computer with 4 cores:

(a) Registering a fork cluster:

library(doParallel)
cl <- makeForkCluster(4)
# equivalently here: cl <- makeForkCluster(nnodes=getOption("mc.cores", 4L))
registerDoParallel(cl)
out <- foreach(i = 1:1000, .combine = "c") %dopar% {
    print(i)
}
stopCluster(cl)

(b) Registering a PSOCK cluster:

library(doParallel)
cl <- makePSOCKcluster(4)
registerDoParallel(cl)
out <- foreach(i = 1:1000, .combine = "c") %dopar% {
    print(i)
}
stopCluster(cl)

(c) Using doMC:

library(doMC)
library(doParallel)
registerDoMC(4)
out <- foreach(i = 1:1000, .combine = "c") %dopar% {
    print(i)
}

Several users have observed that the doMC method leaves zombie processes behind (doMC is just a wrapper around the mclapply function, so it's not doMC's fault; see: How to kill a doMC worker when it's done?). In an answer to a previous question (How to stop R from leaving zombie processes behind) it was suggested that a fork cluster might not leave zombie processes behind, and in another question (Remove zombie processes using parallel package) it was suggested that a PSOCK cluster might not. However, it seems that all three methods leave zombie processes behind. Zombie processes per se are usually not a problem, because they normally do not bind resources, but they clutter the process tree. I could get rid of them by closing and re-opening R, but that is not a good option when I'm in the middle of a session. Is there an explanation why this happens (or even: is there a reason why it has to happen)? And can anything be done so that no zombie processes are left behind?

My system info (R is used in a simple repl session with xterm and tmux):

> library(devtools)
> session_info()
Session info-------------------------------------------------------------------
 setting  value                                             
 version  R Under development (unstable) (2014-08-16 r66404)
 system   x86_64, linux-gnu                                 
 ui       X11                                               
 language (EN)                                              
 collate  en_IE.UTF-8                                       
 tz       <NA>                                              

Packages-----------------------------------------------------------------------
 package    * version  source          
 codetools    0.2.8    CRAN (R 3.2.0)  
 devtools   * 1.5.0.99 Github (c429ae2)
 digest       0.6.4    CRAN (R 3.2.0)  
 doMC       * 1.3.3    CRAN (R 3.2.0)  
 evaluate     0.5.5    CRAN (R 3.2.0)  
 foreach    * 1.4.2    CRAN (R 3.2.0)  
 httr         0.4      CRAN (R 3.2.0)  
 iterators  * 1.0.7    CRAN (R 3.2.0)  
 memoise      0.2.1    CRAN (R 3.2.0)  
 RCurl        1.95.4.3 CRAN (R 3.2.0)  
 rstudioapi   0.1      CRAN (R 3.2.0)  
 stringr      0.6.2    CRAN (R 3.2.0)  
 whisker      0.3.2    CRAN (R 3.2.0)  

Small edit: At least for makeForkCluster() it seems that the forks it spawns sometimes get killed and reaped correctly by the parent, and sometimes they are not reaped and become zombies. This seems to happen only when the cluster is not closed quickly enough after the loop aborts or finishes; at least that is when it happened the last few times.

lord.garbage asked Aug 19 '14 16:08



1 Answer

You could get rid of the zombie processes using the "inline" package. Just implement a function that calls "waitpid":

library(inline)
includes <- '#include <sys/wait.h>'
code <- 'int wstat; while (waitpid(-1, &wstat, WNOHANG) > 0) {};'
wait <- cfunction(body=code, includes=includes, convention='.C')

I tested this by first creating some zombies with the mclapply function:

> library(parallel)
> pids <- unlist(mclapply(1:4, function(i) Sys.getpid(), mc.cores=4))
> system(paste0('ps --pid=', paste(pids, collapse=',')))
  PID TTY          TIME CMD
17447 pts/4    00:00:00 R <defunct>
17448 pts/4    00:00:00 R <defunct>
17449 pts/4    00:00:00 R <defunct>
17450 pts/4    00:00:00 R <defunct>

(Note that I'm using the GNU version of "ps" which supports the "--pid" option.)

Then I called my "wait" function and called "ps" again to verify that the zombies are gone:

> wait()
list()
> system(paste0('ps --pid=', paste(pids, collapse=',')))
  PID TTY          TIME CMD

It appears that the worker processes created by mclapply are now gone. This should work as long as the processes were created by the current R process.

Steve Weston answered Oct 05 '22 12:10