Is it true that fork() calls clone() internally?

Tags: c, linux, operating-system, system-calls

I read in Chapter 3 of "Linux Kernel Development, Second Edition" by Robert Love (ISBN 0-672-32720-1) that the clone system call is used to create a thread in Linux. The signature of clone requires the address of a start routine to be passed to it.

But on the same page it says that fork calls clone internally. So my question is: how does the child process created by fork start running the code that follows the fork call? In other words, why does it not need a function as its starting point?

If the links I provided contain incorrect information, please point me to better resources.

599

asked Sep 19 '13 20:09

Don't You Worry Child

1 Answer

For questions like this, always read the source code.

In glibc's nptl/sysdeps/unix/sysv/linux/fork.c (GitHub) (nptl = Native POSIX Thread Library) we can find the implementation of fork(), which is definitely not a syscall. The magic happens inside the ARCH_FORK macro, which is defined as an inline invocation of the clone syscall in nptl/sysdeps/unix/sysv/linux/x86_64/fork.c (GitHub). But wait, no function or stack pointer is passed to this version of clone()! So, what is going on here?

Let's look at the implementation of the clone() library function in glibc, then. It's in sysdeps/unix/sysv/linux/x86_64/clone.S (GitHub). It saves the function pointer on the child's stack and invokes the clone syscall; the new process then pops the function pointer off the stack and calls it.

So it works like this:

clone(void (*fn)(void *), void *stack_pointer) {
    push fn onto stack_pointer
    syscall_clone()
    if (child) {
        pop fn off of stack
        fn();
        exit();
    }
}
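To see the wrapper side of this from user code, here is a minimal sketch, assuming Linux with glibc on x86-64 (where the stack grows downward). The names child_fn and run_child_via_clone_wrapper are made up for illustration; only clone(), waitpid() and the wait macros come from the real API.

```c
#define _GNU_SOURCE          /* needed for the clone() declaration in <sched.h> */
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical child entry point: the glibc wrapper calls this on the
   new stack; its return value becomes the child's exit status. */
static int child_fn(void *arg)
{
    return *(int *)arg;
}

/* Hypothetical helper: starts a child through the glibc clone() wrapper
   and returns the child's exit status, or -1 on error. */
static int run_child_via_clone_wrapper(int code)
{
    const size_t stack_size = 1024 * 1024;
    char *stack = malloc(stack_size);
    if (stack == NULL)
        return -1;

    /* Assumption: the stack grows down, so pass the top of the buffer.
       Without CLONE_VM the child works on its own copy of memory. */
    pid_t pid = clone(child_fn, stack + stack_size, SIGCHLD, &code);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child_via_clone_wrapper(7) from main should give back 7 on such a system, since the child never returns into the caller's code; it only runs child_fn and exits.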

And fork() is...

fork() {
    ...
    syscall_clone();
    ...
}

Summary

The actual clone() syscall does not take a function argument; the child simply continues from the point where the syscall returns, just like fork(). So both the clone() and fork() library functions are wrappers around the clone() syscall.
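The fork-like behaviour of the raw syscall can be observed directly with syscall(2). This is only a sketch, assuming x86-64 Linux, where flags is the first argument of the raw call (a few architectures order the arguments differently). The helper name fork_via_raw_clone is hypothetical, and this is not how glibc's fork() is written; it only shows that no entry function is involved.

```c
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: creates a child with the raw clone syscall and
   returns the child's exit status, or -1 on error. */
static int fork_via_raw_clone(int code)
{
    /* flags = SIGCHLD, child_stack = 0: no entry function and no new
       stack. The remaining arguments are unused with these flags. */
    long pid = syscall(SYS_clone, (unsigned long)SIGCHLD, 0L, 0L, 0L, 0L);
    if (pid == -1)
        return -1;

    if (pid == 0)
        _exit(code);   /* child: execution resumed right after the syscall */

    /* parent: the syscall returned the child's PID */
    int status = 0;
    if (waitpid((pid_t)pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both processes come back from the same syscall() line and are told apart only by the return value, which is the same pattern as `if (fork() == 0)`.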

Documentation

My copy of the manual is somewhat more upfront about the fact that clone() is both a library function and a system call. However, I do find it somewhat misleading that clone() is found in section 2, rather than both section 2 and section 3. From the man page:

#include <sched.h>

int clone(int (*fn)(void *), void *child_stack,
          int flags, void *arg, ...
          /* pid_t *ptid, struct user_desc *tls, pid_t *ctid */ );

/* Prototype for the raw system call */

long clone(unsigned long flags, void *child_stack,
          void *ptid, void *ctid,
          struct pt_regs *regs);

And,

This page describes both the glibc clone() wrapper function and the underlying system call on which it is based. The main text describes the wrapper function; the differences for the raw system call are described toward the end of this page.

Finally,

The raw clone() system call corresponds more closely to fork(2) in that execution in the child continues from the point of the call. As such, the fn and arg arguments of the clone() wrapper function are omitted. Furthermore, the argument order changes.
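The changed argument order can be checked with a small experiment. The sketch below assumes x86-64 Linux and uses the raw layout from the prototype above (flags first, then child_stack, then ptid). It takes CLONE_PARENT_SETTID from the kernel header <linux/sched.h>, and the helper name raw_clone_reports_tid is made up for illustration.

```c
#include <linux/sched.h>     /* CLONE_PARENT_SETTID */
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the kernel stored the child's ID
   through the ptid argument, 0 if it did not, -1 on error. */
static int raw_clone_reports_tid(void)
{
    pid_t ptid = -1;

    long pid = syscall(SYS_clone,
                       (unsigned long)(CLONE_PARENT_SETTID | SIGCHLD), /* flags */
                       0L,      /* child_stack: keep using a copy of this stack */
                       &ptid,   /* ptid: written in the parent's memory */
                       0L, 0L); /* remaining arguments unused with these flags */
    if (pid == -1)
        return -1;

    if (pid == 0)
        _exit(0);    /* child: nothing to do */

    if (waitpid((pid_t)pid, NULL, 0) == -1)
        return -1;

    /* For a single-threaded child, the thread ID equals the PID. */
    return ptid == (pid_t)pid;
}
```

With the glibc wrapper the same request would put fn and child_stack first and ptid after arg, which is the reordering the man page refers to.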

138

answered Sep 22 '22 18:09

Dietrich Epp
