Motivation for spawning a new process vs. a thread

I understand that if your program has large segments that can be executed in parallel, it is beneficial to spawn new threads when those segments are not all bound by a single resource. An example of this would be a web server handling page requests.

Threads are beneficial in that inter-thread communication is much less costly and context switching is much faster.

Processes give you more security in that one process cannot "mess" with another process's contents, whereas if one thread crashes, it is likely that all threads within that process will crash.

My question is, what are some examples as to when you would want to use a process (for example by fork() in C)?

I can see that if a program wants to launch another program, it makes sense to encapsulate that in a new process, but I feel that I am missing some larger reason for starting a new process.

Specifically, when does it make sense to have one program spawn a new process vs thread?

823

asked Apr 30 '11 04:04

Bob

1 Answer

The main reason for using processes is that a process can crash or go crazy, and the OS will limit the effect that this has on other processes. So, for example, Firefox has recently started running plugins in separate processes, IIRC Chrome runs different pages in different processes, and web servers have for a long time handled individual requests in separate processes.

There are a few different ways in which OSes apply limits:

Crashes. As you note, if a thread crashes it generally takes down the whole process. This motivates the browser process boundaries: browsers and browser plugins are complex bits of code subject to constant attack, so it makes sense to take unusual precautions.
Resource limits. If a thread in your process opens a lot of files, allocates a lot of memory, etc., then it affects your whole process. Another process needn't affect you, because it can be limited separately. So each request in a web server might be more limited in its resource usage than the server as a whole, because you want your server to serve multiple requests simultaneously without any one remote user hogging resources.
Capabilities. This varies by OS, but just for example you can run a process in a chroot jail to ensure that it doesn't modify or read files it shouldn't, no matter how vulnerable your code is to exploits. For another example, Symbian OS has an explicit list of permissions to do various things with the system ("read user phonebook", "write user phonebook", "decrypt DRM files" and so on). There's no way to surrender permissions that your process has, so if you want to do something highly sensitive, and then fall back to a low-sensitivity mode, you need a process boundary somewhere. One reason to want to do this is security: unknown code, or code that might contain security flaws, can be somewhat sandboxed, and a smaller quantity of code that isn't limited can be subjected to increased scrutiny. Another reason is simply to have the OS enforce certain aspects of your design.
Drivers. In general, a device driver controls shared access to a unique system resource. As with capabilities, restricting this access to a single driver process means you can forbid it to all the other processes. For example, IIRC, TrueCrypt on Windows installs a driver with enhanced permissions that allow it to register an encrypted container with a drive letter and then act like any other Windows filesystem. The GUI part of the app runs in regular user mode. I'm not sure whether filesystem drivers on Windows actually need an associated process, but device drivers in general might do, so even if this isn't a good example, hopefully it gives the idea.

Another potential reason for using processes is that it makes it easier to reason about your code. In multi-threaded code you rely on invariants of all your classes to deduce that access to a particular object is serialized; if your code isn't multi-threaded then you know that it is[*]. It's possible to do this with multi-threaded code as well, of course: just make sure you know what thread "owns" each object, and never access an object from a thread that isn't its owner. Process boundaries enforce this, rather than leaving it as something you merely design for. Again, I'm not certain that this is the motivation, but for example the World Community Grid client can use multiple cores. In that mode it runs multiple processes with a completely different task in each, so it gets the performance benefits of the additional cores without any individual task needing to be parallelizable, or the code for any task needing to be thread-safe.

[*] Well, as long as the object wasn't created in shared memory. You also need to avoid unexpected recursive calls and the like, but that's usually a simpler problem than synchronizing multi-threaded code.

181

answered Oct 12 '22 11:10

Steve Jessop

Tags: c, process, multithreading
