fork() leaking? Taking longer and longer to fork a simple process

I have a system in which two identical processes are run (let's call them replicas). When signaled, a replica will duplicate itself by using the fork() call. A third process selects one of the processes to kill randomly, and then signals the other to create a replacement. Functionally, the system works well; it can kill / respawn replicas all day except for the performance issue.

The fork() call is taking longer and longer. The following is the simplest setup that still displays the problem. The timing is shown in the graph below (figure: fork timing).

The replica's code is the following:

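#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// timestamp_t and generate_timestamp() are timing helpers whose
// definitions are not shown in the question.

void restartHandler(int signo) {
  // Take a timestamp just before forking
  timestamp_t last = generate_timestamp();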
  pid_t currentPID = fork();


  if (currentPID >= 0) { // Successful fork
    if (currentPID == 0) { // Child process
      timestamp_t current = generate_timestamp();
      printf("%lld\n", current - last);

      // unblock the signal
      sigset_t signal_set;
      sigemptyset(&signal_set);
      sigaddset(&signal_set, SIGUSR1);
      sigprocmask(SIG_UNBLOCK, &signal_set, NULL);

      return;
    } else {   // Parent: reap any already-exited child without blocking, then return
      waitpid(-1, NULL, WNOHANG);
      return;
    }
  } else {
    printf("Fork error!\n");
    return;
  }
}

int main(int argc, const char **argv) {
  if (signal(SIGUSR1, restartHandler) == SIG_ERR) {
    perror("Failed to register the restart handler");
    return -1;
  }

  while(1) {
    sleep(1);
  }

  return 0;
}

The longer the system runs, the worse it gets.

Sorry for lacking a specific question, but does anyone have any idea / clues as to what is going on? It seems to me that there is a resource leak in the kernel (thus the linux-kernel tag), but I don't know where to start looking.

What I have tried:

Tried kmemleak, which did not catch anything. This implies that if there is a memory "leak", the memory is still reachable.
/proc/<pid>/maps is not growing.
Currently running the 3.14 kernel with RT patch (note this happens with non-rt and rt processes), and have also tried on 3.2.
Zombie processes are not an issue. I have tried a version in which I set up another process as a subreaper using prctl.
I first noticed this slowdown in a system in which the timing measurements are done outside of the restarted process; same behavior.

Any hints? Anything I can provide to help? Thanks!

asked Dec 08 '14 23:12

superdesk

2 Answers

The slowdown is caused by an accumulation of anonymous vmas, and is a known problem. The problem is evident when there are a large number of fork() calls and the parent exits before the children. The following code recreates the problem (source: Daniel Forrest):

#include <unistd.h>

int main(int argc, char *argv[])
{
  pid_t pid;
  while (1) {
    pid = fork();
    if (pid == -1) {
      /* error */
      return 1;
    }
    if (pid) {
      /* parent */
      sleep(2);
      break;
    }
    else {
      /* child */
      sleep(1);
    }
  }
  return 0;
}

The behavior can be confirmed by checking anon_vma in /proc/slabinfo.
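
One way to watch that counter from a program is sketched below. The function names are hypothetical, and reading /proc/slabinfo usually requires root.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if a slabinfo line describes the named cache, i.e. the line
 * starts with the name followed by whitespace. This keeps "anon_vma"
 * from also matching "anon_vma_chain". */
int slab_line_matches(const char *line, const char *name) {
  size_t n = strlen(name);
  return strncmp(line, name, n) == 0 && (line[n] == ' ' || line[n] == '\t');
}

/* Print the lines for one cache from the given slabinfo file.
 * Returns the number of lines printed, or -1 if the file cannot be
 * opened (for example, when not running as root). */
int print_slab_cache(const char *path, const char *name) {
  FILE *f = fopen(path, "r");
  if (f == NULL) {
    return -1;
  }
  char line[512];
  int printed = 0;
  while (fgets(line, sizeof line, f) != NULL) {
    if (slab_line_matches(line, name)) {
      fputs(line, stdout);
      printed++;
    }
  }
  fclose(f);
  return printed;
}
```

Calling print_slab_cache("/proc/slabinfo", "anon_vma") every few seconds while the reproducer runs should show the object counts (the first two numeric columns) climbing if the accumulation is present.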

There is a patch (source) which limits the length of copied anon_vma_chain to five. I can confirm that the patch fixes the problem.

As for how I eventually found the problem, I finally just started putting printk calls throughout the fork code and checking the times shown in dmesg. Eventually I saw that it was the call to anon_vma_fork which was taking longer and longer. Then it was a quick matter of Google searching.

It took a rather long time, so I would still appreciate any suggestions for a better way to have gone about tracking down the problem. And to all of those who already spent time trying to assist me, thank you.

answered Nov 15 '22 19:11

superdesk

Maybe you could try using the generic wait() call, rather than waitpid()? It's just a guess, but I heard it was better from a professor in undergrad. Also, have you tried using AddressSanitizer?
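
For reference, POSIX defines wait(&status) as equivalent to waitpid(-1, &status, 0), so the two differ only in that waitpid() can also target a specific child or avoid blocking. A minimal sketch of the two calls side by side (the function names are hypothetical):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with the given code.
 * Returns the child's pid in the parent, or -1 if fork() fails. */
pid_t spawn_exiting_child(int code) {
  pid_t pid = fork();
  if (pid == 0) {
    _exit(code);
  }
  return pid;
}

/* Block until any child exits, using wait().
 * Returns the child's exit code, or -1 on error or abnormal exit. */
int exit_code_via_wait(void) {
  int status;
  if (wait(&status) == -1 || !WIFEXITED(status)) {
    return -1;
  }
  return WEXITSTATUS(status);
}

/* The same operation expressed with waitpid(). */
int exit_code_via_waitpid(void) {
  int status;
  if (waitpid(-1, &status, 0) == -1 || !WIFEXITED(status)) {
    return -1;
  }
  return WEXITSTATUS(status);
}
```

Both helpers block, so neither is a drop-in replacement for the non-blocking waitpid(-1, NULL, WNOHANG) call used in the question's signal handler.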

Also, you can use GDB to debug a child process as well (if you haven't already tried that). You can use follow-fork-mode:

set follow-fork-mode child

With that setting GDB follows only the child, and the parent runs undebugged. You can debug both by calling sleep() in the child right after forking, getting the child's pid, and then attaching to it:

attach <child process pid>

then call:

detach

This is useful because you can also run the program under valgrind and inspect memory leaks from GDB. Just call valgrind with

valgrind --vgdb-error=0...<executable>

then set some relevant breakpoints, and continue through your program until you hit your breakpoints then search for leaks:

monitor leak_check full reachable any

then:

monitor block_list <loss_record_nr>

answered Nov 15 '22 18:11

J_COL


Tags: c, linux, fork, linux-kernel
