I've heard that mixing forking and threading in a program could be very problematic, often resulting with mysterious behavior, especially when dealing with shared resources, such as locks, pipes, file descriptors. But I never fully understand what exactly the dangers are and when those could happen. It would be great if someone with expertise in this area could explain a bit more in detail what pitfalls are and what needs to be care when programming in a such environment.
For example, if I want to write a server that collects data from various different resources, one solution I've thought is to have the server spawns a set of threads, each popen to call out another program to do the actual work, open pipes to get the data back from the child. Each of these threads responses for its own work, no data interexchange in b/w them, and when the data is collected, the main thread has a queue and these worker threads will just put the result in the queue. What could go wrong with this solution?
Please do not narrow your answer by just "answering" my example scenario. Any suggestions, alternative solutions, or experiences that are not related to the example but helpful to provide a clean design would be great! Thanks!
No. A fork creates a new process with his own thread(s), copies the file descriptor and the virtual memory. A child process does NOT share the same memory with his father.
Fork is nothing but a new process that looks exactly like the old or the parent process but still it is a different process with different process ID and having it's own memory. Threads are light-weight process which have less overhead.
fork creates a new process. The parent of a process is another process, not a thread. So the parent of the new process is the old process. Note that the child process will only have one thread because fork only duplicates the (stack for the) thread that calls fork .
The fork() system call in UNIX causes creation of a new process. The new process (called the child process) is an exact copy of the calling process (called the parent process) except for the following: The child process has a unique process ID.
The problem with forking when you do have some threads running is that the fork only copies the CPU state of the one thread that called it. It's as if all of the other threads just died, instantly, wherever they may be.
The result of this is locks aren't released, and shared data (such as the malloc heap) may be corrupted.
pthread does offer a pthread_atfork function - in theory, you could take every lock in the program before forking, release them after, and maybe make it out alive - but it's risky, because you could always miss one. And, of course, the stacks of the other threads won't be freed.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With