I've tried to look this up, but I'm struggling a bit to understand the relation between the Parent Process and the Child Process immediately after I call fork().
Are they completely separate processes, only associated by the id/parent id? Or do they share memory? For example the 'code' section of each process - is that duplicated so that each process has it's own identical copy, or is that 'shared' in some way so that only one exists?
I hope that makes sense.
In the name of full disclosure this is 'homework related'; while not a direct question from the book, I have a feeling it's mostly academic and, in practice, I probably don't need to know.
When a process calls fork, it is deemed the parent process and the newly created process is its child. After the fork, both processes not only run the same program, but they resume execution as though both had called the system call.
What is a Fork()? In the computing field, fork() is the primary method of process creation on Unix-like operating systems. This function creates a new copy called the child out of the original process, that is called the parent. When the parent process closes or crashes for some reason, it also kills the child process.
Upon successful completion, fork() returns 0 to the child process and returns the process ID of the child process to the parent process. Otherwise, -1 is returned to the parent process, no child process is created, and errno is set to indicate the error.
As it appears to the process, the entire memory is duplicated.
In reality, it uses "copy on write" system. The first time either process changes its memory after fork(), a separate copy is made of the modified page (usually 4kB).
Usually the code segment of a process is not modified, in which case it remains shared.
Logically, a fork creates an identical copy of the original process that is largely independent of the original. For performance reasons, memory is shared with copy-on-write semantics, which means that unmodified memory (such as code) remains shared.
File descriptors are duplicated, so that the forked process could, in principle, take over a database connection on behalf of the parent (or they could even jointly communicate with the database if the programmer is a bit twisted). More commonly, this is used to set up pipes between processes so you can write find -name '*.c' | xargs grep fork
.
A bunch of other stuff is shared. See here for details.
One important omission is threads — the child process only inherits the thread that called fork()
. This causes no end of trouble in multithreaded programs, since the status of mutexes, etc., that were locked in the parent is implementation-specific (and don't forget that malloc()
and printf()
use locks internally). The only safe thing to do in the child after fork()
returns is to call execve()
as soon as possible, and even then you have to be cautious with file descriptors. See here for the full horror story.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With