for (; 1;) {
if (fork() == 0) break;
int sig = 0;
for (; 1; usleep(10000)) {
pid_t wpid = waitpid(g->pid[1], &sig, WNOHANG);
if (wpid > 0) break;
if (wpid < 0) print("wait error: %s\n", strerror(errno));
}
}
waitpid
should return the pid of child process immediately!
But waitpid
got the pid number after about 90 seconds,
cube 28139 0.0 0.0 70576 900 ? Ss 04:24 0:07 ./daemon -d
cube 28140 9.3 0.0 0 0 ? Zl 04:24 106:19 [daemon] <defunct>
strace -p 28139
Process 28139 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = 0
wait4(28140, 0x7fff08a2681c, WNOHANG, NULL) = 0
nanosleep({0, 10000000}, NULL) = 0
wait4(28140, 0x7fff08a2681c, WNOHANG, NULL) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
restart_syscall(<... resuming interrupted call ...>) = 0
wait4(28140, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL}], WNOHANG, NULL) = 28140
I finally find out there were some fd leaks during deep tracing by lsof.
After fd leaks were fixed, the problem was gone.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With