Where is the source for the fork() call in Linux? [closed]

I've spent quite some time trying to find the source code for the fork() function. I know that most of the work done by fork() is done by do_fork(), which can be found in kernel/fork.c. However, what I want to see is the source code for the fork() function itself.

Any ideas where it could be found? I've been going through GCC and Linux source code but still haven't managed to find it.

Edit: I'm trying to find the exact implementation my system is using. As mentioned in the comments and in this link, it's apparently in a wrapper in glibc. Any idea where in glibc I can find the wrapper? I've searched thoroughly but couldn't find its definition.

Programmer123 asked Jan 19 '16 07:01




3 Answers

Taking as reference the x86 platform and the 2.6.23 Linux kernel:

  • Create the test-fork.c file:

    #include <unistd.h>
    
    int main (void)
    {
        fork();
        return 0;
    }
    
  • Compile it with static linking: gcc -O0 -static -Wall test-fork.c -o test-fork

  • Disassemble it: objdump -D -S test-fork > test-fork.dis

  • Open the test-fork.dis file and search for fork:

            fork();
     80481f4:       e8 63 55 00 00          call   804d75c <__libc_fork>
            return 0;
     80481f9:       b8 00 00 00 00          mov    $0x0,%eax
    }
     80481fe:       c9                      leave  
     80481ff:       c3                      ret    
    
  • Then search __libc_fork:

     0804d75c <__libc_fork>:
     804d75c:       55                      push   %ebp
     804d75d:       b8 00 00 00 00          mov    $0x0,%eax
     804d762:       89 e5                   mov    %esp,%ebp
     804d764:       53                      push   %ebx
     804d765:       83 ec 04                sub    $0x4,%esp
     804d768:       85 c0                   test   %eax,%eax
     804d76a:       74 12                   je     804d77e <__libc_fork+0x22>
     804d76c:       c7 04 24 80 e0 0a 08    movl   $0x80ae080,(%esp)
     804d773:       e8 88 28 fb f7          call   0 <_init-0x80480d4>
     804d778:       83 c4 04                add    $0x4,%esp
     804d77b:       5b                      pop    %ebx
     804d77c:       5d                      pop    %ebp
     804d77d:       c3                      ret    
     804d77e:       b8 02 00 00 00          mov    $0x2,%eax
     804d783:       cd 80                   int    $0x80
     804d785:       3d 00 f0 ff ff          cmp    $0xfffff000,%eax
     804d78a:       89 c3                   mov    %eax,%ebx
     804d78c:       77 08                   ja     804d796 <__libc_fork+0x3a>
     804d78e:       89 d8                   mov    %ebx,%eax
     804d790:       83 c4 04                add    $0x4,%esp
     804d793:       5b                      pop    %ebx
     804d794:       5d                      pop    %ebp
     804d795:       c3                      ret    
    

    Notice that on this particular hardware/kernel, fork is associated with syscall number 2: the wrapper loads 2 into %eax (mov $0x2,%eax) and then traps into the kernel with int $0x80.

  • Download a copy of the Linux kernel: wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.23.tar.bz2

  • Open the linux-2.6.23/arch/x86/kernel/syscall_table_32.S file

  • Notice that syscall number 2 is associated with sys_fork:

         .long sys_fork   /* 2 */
    
  • Open the linux-2.6.23/arch/x86/kernel/process.c file

  • Search for sys_fork:

      asmlinkage int sys_fork(struct pt_regs regs)
      {
              return do_fork(SIGCHLD, regs.esp, &regs, 0, NULL, NULL);
      }
    

    Notice that SIGCHLD is the only clone flag passed to do_fork().

  • Open the linux-2.6.23/kernel/fork.c file. Here is where do_fork() is defined!

  • do_fork() then calls copy_process():

      /*
       *  Ok, this is the main fork-routine.
       *
       * It copies the process, and if successful kick-starts
       * it and waits for it to finish using the VM if required.
       */
      long do_fork(unsigned long clone_flags,
                    unsigned long stack_start,
                    struct pt_regs *regs,
                    unsigned long stack_size,
                    int __user *parent_tidptr,
                    int __user *child_tidptr)
      {
              struct task_struct *p;
              int trace = 0;
              struct pid *pid = alloc_pid();
              long nr;
    
              if (!pid)
                      return -EAGAIN;
              nr = pid->nr;
              if (unlikely(current->ptrace)) {
                      trace = fork_traceflag (clone_flags);
                      if (trace)
                              clone_flags |= CLONE_PTRACE;
              }
    
              p = copy_process(clone_flags, stack_start, regs, stack_size, \
                               parent_tidptr, child_tidptr, pid);
    
    
              /*
               * Do this prior waking up the new thread - the thread
               * pointer might get invalid after that point,
               * if the thread exits quickly.
               */
              if (!IS_ERR(p)) {
                      struct completion vfork;
    
                      if (clone_flags & CLONE_VFORK) {
                              p->vfork_done = &vfork;
                              init_completion(&vfork);
                      }
    
                      if ((p->ptrace & PT_PTRACED) || \
                          (clone_flags & CLONE_STOPPED)) {
                              /*
                               * We'll start up with an immediate SIGSTOP.
                               */
                              sigaddset(&p->pending.signal, SIGSTOP);
                              set_tsk_thread_flag(p, TIF_SIGPENDING);
                      }
    
                      if (!(clone_flags & CLONE_STOPPED))
                              wake_up_new_task(p, clone_flags);
                      else
                              p->state = TASK_STOPPED;
    
                      if (unlikely (trace)) {
                              current->ptrace_message = nr;
                              ptrace_notify ((trace << 8) | SIGTRAP);
                      }
    
                      if (clone_flags & CLONE_VFORK) {
                              freezer_do_not_count();
                              wait_for_completion(&vfork);
                              freezer_count();
                              if (unlikely (current->ptrace & \
                                            PT_TRACE_VFORK_DONE)) {
                                      current->ptrace_message = nr;
                                      ptrace_notify \
                                        ((PTRACE_EVENT_VFORK_DONE << 8) | \
                                          SIGTRAP);
                              }
                      }
              } else {
                      free_pid(pid);
                      nr = PTR_ERR(p);
              }
              return nr;
      }
    
  • The bulk of the work in forking is handled by do_fork(), defined in kernel/fork.c. Operations performed by do_fork():

    • It allocates a new PID for the child by calling alloc_pid()
    • It checks the ptrace field of the parent (i.e. current->ptrace)
      • If it is not zero, the parent process is being traced by another process
    • It calls copy_process(), which sets up the process descriptor and any other kernel data structure required for child's execution

      • Its parameters are the same as do_fork() plus the PID of the child
      • It checks if the flags passed in the clone_flags parameter are compatible
      • It performs additional security checks by invoking security_task_create() and security_task_alloc()
      • It calls dup_task_struct() which creates new kernel stack, thread_info and task_struct structures for the new process.

        • The new values are identical to those of current task
        • At this point child and parent process descriptors are identical
        • It executes the alloc_task_struct() macro to get a task_struct structure for the new process, and stores its address in the tsk local variable.
        • It executes the alloc_thread_info macro to get a free memory area to store the thread_info structure and the Kernel Mode stack of the new process, and saves its address in the ti local variable
        • It copies the contents of current's process descriptor into the task_struct structure pointed to by tsk, then it sets tsk->thread_info to ti
        • It copies the contents of current's thread_info descriptor into the structure pointed to by ti, then it sets ti->task to tsk
        • It sets the usage counter of the new process descriptor (i.e. tsk->usage) to 2 to specify that the process descriptor is in use and that the corresponding process is alive (its state is not EXIT_ZOMBIE or EXIT_DEAD)
        • It returns the process descriptor pointer of the new process (i.e. tsk)
      • copy_process() then checks that the per-user process limit (RLIMIT_NPROC) has not been exceeded and that the total number of threads in the system is below max_threads
      • It differentiates the child from the parent by clearing or initializing various fields of the task_struct
      • It calls copy_flags() to update the flags field of the task_struct

        • The PF_SUPERPRIV (which denotes if a task used superuser privileges) and PF_NOFREEZE flags are cleared
        • The PF_FORKNOEXEC flag (which denotes if a task has not called exec()) is set
      • It calls init_sigpending(), which clears the pending signals
      • Depending on the parameters passed to do_fork(), copy_process() then duplicates or shares resources:

        • Open files
        • Filesystem information
        • Signal handlers
        • Address space
      • It calls sched_fork(), which splits the remaining timeslice between parent and child
      • Finally, it returns a pointer to the new child
    • Then, do_fork() adds a pending SIGSTOP signal in case the CLONE_STOPPED flag is set or the child process must be traced (i.e. the PT_PTRACED flag is set in p->ptrace)

    • If the CLONE_STOPPED flag is not set, it invokes the wake_up_new_task() function, which performs the following operations:

      • It adjusts the scheduling parameters of both the parent and the child
      • If the child will run on the same CPU as the parent and parent and child do not share the same set of page tables (i.e. CLONE_VM flag cleared), it then forces the child to run before the parent by inserting it into the parent's runqueue right before the parent. This simple step yields better performance if the child flushes its address space and executes a new program right after the forking. If we let the parent run first, the Copy On Write mechanism would give rise to a series of unnecessary page duplications.
      • Otherwise, if the child will not be run on the same CPU as the parent, or if parent and child share the same set of page tables (i.e. CLONE_VM flag set), it inserts the child in the last position of the parent's runqueue
    • Otherwise, if the CLONE_STOPPED flag is set, it puts the child in the TASK_STOPPED state
    • If the parent process is being traced, it stores the PID of the child in the ptrace_message field of current and invokes ptrace_notify(), which essentially stops the current process and sends a SIGCHLD signal to its parent. The "grandparent" of the child is the debugger that is tracing the parent; the SIGCHLD signal notifies the debugger that current has forked a child, whose PID can be retrieved by looking into the current->ptrace_message field.

    • If the CLONE_VFORK flag is specified, it inserts the parent process in a wait queue and suspends it until the child releases its memory address space (that is, until the child either terminates or executes a new program)

  • Finally, do_fork() terminates by returning the PID of the child.
Claudio answered Nov 15 '22 20:11


From http://lxr.free-electrons.com/source/kernel/fork.c#L1787 which is for Linux 4.4:

    #ifdef __ARCH_WANT_SYS_FORK
    SYSCALL_DEFINE0(fork)
    {
    #ifdef CONFIG_MMU
            return _do_fork(SIGCHLD, 0, 0, NULL, NULL, 0);
    #else
            /* can not support in nommu mode */
            return -EINVAL;
    #endif
    }
    #endif

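I believe this is where the fork syscall is defined. On Linux, I believe the glibc fork() wrapper ultimately invokes this syscall (or the equivalent clone syscall with SIGCHLD), apart from user-space bookkeeping such as running handlers registered with pthread_atfork().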

Penguin Brian answered Nov 15 '22 19:11


Here is the link to glibc file fork.c

dlmeetei answered Nov 15 '22 19:11