How to stop R from leaving zombie processes behind

Here is a little reproducible example:

library(doMC)
library(doParallel)
registerDoMC(4)
timing <- system.time(
    fitall <- foreach(i = 1:1000, .combine = "c") %dopar% {
        print(i)
    }
)

I start up R and look at the process table:

> system("ps -efl")
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S chbr         1     0  5  80   0 - 21399 wait   10:58 ?        00:00:00 /usr/local/lib/R/bin/exec/R --no-save --no-restore
0 S chbr         9     1  0  80   0 -  1113 wait   10:58 ?        00:00:00 sh -c ps -efl
0 R chbr        10     9  0  80   0 -  4294 -      10:58 ?        00:00:00 ps -efl

If I run the aforementioned simple foreach loop, doMC or doParallel leaves a zombie process behind. Output of ps -efl after running the loop:

> system("ps -efl")
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S chbr         1     0  4  80   0 - 25256 wait   11:00 ?        00:00:00 /usr/local/lib/R/b
1 Z chbr        10     1  0  80   0 -     0 exit   11:00 ?        00:00:00 [R] <defunct>
0 S chbr        12     1  0  80   0 -  1113 wait   11:00 ?        00:00:00 sh -c ps -efl
0 R chbr        13    12  0  80   0 -  4294 -      11:00 ?        00:00:00 ps -efl

If I repeat the loop without issuing registerDoMC(4) again, no additional zombie process gets created. However, if I issue registerDoMC(4) again, an additional zombie process gets created:

> system("ps -efl")
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S chbr         1     0  0  80   0 - 25554 wait   11:00 ?        00:00:01 /usr/local/lib/R/b
1 Z chbr        21     1  0  80   0 -     0 exit   11:02 ?        00:00:00 [R] <defunct>
1 Z chbr        22     1  0  80   0 -     0 exit   11:02 ?        00:00:00 [R] <defunct>
0 S chbr        26     1  0  80   0 -  1113 wait   11:03 ?        00:00:00 sh -c ps -efl
0 R chbr        27    26  0  80   0 -  4294 -      11:03 ?        00:00:00 ps -efl

That is how I figured doMC might be doing something it should not. If doMC is causing this, is there a way to stop it from leaving zombie processes behind? (stopCluster() does not work, as no cluster gets created in the first place.)

> sessionInfo()
R Under development (unstable) (2014-08-16 r66404)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_IE.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_IE.UTF-8        LC_COLLATE=en_IE.UTF-8    
 [5] LC_MONETARY=en_IE.UTF-8    LC_MESSAGES=en_IE.UTF-8   
 [7] LC_PAPER=en_IE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_IE.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] doParallel_1.0.8 doMC_1.3.3       iterators_1.0.7  foreach_1.4.2   

loaded via a namespace (and not attached):
[1] codetools_0.2-8 compiler_3.2.0

lord.garbage asked Aug 17 '14 11:08

1 Answer

This is not specific to foreach or doMC. As Steve Weston has pointed out in answers to other Stack Overflow questions, doMC is essentially just a wrapper for mclapply, and a simple call to mclapply also creates zombie processes:

library(parallel)
mclapply(rep(5,4), rnorm)

On my system, this leaves two zombie processes:

[richcalaway@richcalaway-pc ~]$ ps -efl | grep defunct
1 Z 1660945517 28701 28624  0 77  0 -     0 exit   12:00 pts/1    00:00:00 [R] <defunct>
1 Z 1660945517 28702 28624  0 78  0 -     0 exit   12:00 pts/1    00:00:00 [R] <defunct>
0 S 1660945517 28704 28308  0 78  0 - 15306 pipe_w 12:00 pts/2    00:00:00 grep defunct

Under normal circumstances, these zombie processes won't cause any trouble, and they do disappear when the R session ends. You can avoid them by using doParallel and a fork cluster instead of using doMC.
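
A minimal sketch of the fork-cluster approach, assuming a Unix-alike system with doParallel installed. The worker count of 4 and the loop are carried over from the question, with the loop body simplified to return i. makeForkCluster() comes from the parallel package. The final registerDoSEQ() call is an addition here, so that later %dopar% loops do not try to use the stopped cluster. Whether every defunct entry disappears may depend on the R and package versions, so it is worth checking with ps afterwards.

```r
library(doParallel)  # also attaches foreach, iterators and parallel

# Start four forked worker processes (fork clusters are not available on Windows).
cl <- makeForkCluster(4)
registerDoParallel(cl)

# Same loop as in the question, now running on the cluster workers.
fitall <- foreach(i = 1:1000, .combine = "c") %dopar% {
    i
}

# Unlike with registerDoMC(), there is now a cluster object to shut down.
stopCluster(cl)

# Fall back to the sequential backend so later %dopar% calls
# do not try to talk to the stopped workers.
registerDoSEQ()
```

Running system("ps -efl | grep defunct") after stopCluster(cl) shows whether any zombie processes remain.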

Cheers,

Rich Calaway

Principal Program Manager

Revolution Analytics

Rich Calaway answered Nov 15 '22 09:11