What is the reason for performing a double fork when creating a daemon?

People also ask

What is forking a daemon?

In a Unix environment, the parent process of a daemon is often (but not always) the init process (PID=1). Processes usually become daemons by forking a child process and then having their parent process immediately exit, thus causing init to adopt the child process.

How does daemon process work?

A daemon process is a background process that is not under the direct control of the user. This process is usually started when the system is bootstrapped and it terminated with the system shut down. Usually the parent process of the daemon process is the init process.

I was trying to understand the double fork and stumbled upon this question here. After a lot of research this is what I figured out. Hopefully it will help clarify things better for anyone who has the same question.

In Unix every process belongs to a group which in turn belongs to a session. Here is the hierarchy…

Session (SID) → Process Group (PGID) → Process (PID)

The first process in the process group becomes the process group leader and the first process in the session becomes the session leader. Every session can have one TTY associated with it. Only a session leader can take control of a TTY. For a process to be truly daemonized (ran in the background) we should ensure that the session leader is killed so that there is no possibility of the session ever taking control of the TTY.

I ran Sander Marechal's python example daemon program from this site on my Ubuntu. Here are the results with my comments.

1. `Parent`    = PID: 28084, PGID: 28084, SID: 28046
2. `Fork#1`    = PID: 28085, PGID: 28084, SID: 28046
3. `Decouple#1`= PID: 28085, PGID: 28085, SID: 28085
4. `Fork#2`    = PID: 28086, PGID: 28085, SID: 28085

Note that the process is the session leader after Decouple#1, because it's PID = SID. It could still take control of a TTY.

Note that Fork#2 is no longer the session leader PID != SID. This process can never take control of a TTY. Truly daemonized.

I personally find terminology fork-twice to be confusing. A better idiom might be fork-decouple-fork.

Additional links of interest:

Unix processes - http://www.win.tue.nl/~aeb/linux/lk/lk-10.html

Strictly speaking, the double-fork has nothing to do with re-parenting the daemon as a child of init. All that is necessary to re-parent the child is that the parent must exit. This can be done with only a single fork. Also, doing a double-fork by itself doesn't re-parent the daemon process to init; the daemon's parent must exit. In other words, the parent always exits when forking a proper daemon so that the daemon process is re-parented to init.

So why the double fork? POSIX.1-2008 Section 11.1.3, "The Controlling Terminal", has the answer (emphasis added):

The controlling terminal for a session is allocated by the session leader in an implementation-defined manner. If a session leader has no controlling terminal, and opens a terminal device file that is not already associated with a session without using the O_NOCTTY option (see open()), it is implementation-defined whether the terminal becomes the controlling terminal of the session leader. If a process which is not a session leader opens a terminal file, or the O_NOCTTY option is used on open(), then that terminal shall not become the controlling terminal of the calling process.

This tells us that if a daemon process does something like this ...

int fd = open("/dev/console", O_RDWR);

... then the daemon process might acquire /dev/console as its controlling terminal, depending on whether the daemon process is a session leader, and depending on the system implementation. The program can guarantee that the above call will not acquire a controlling terminal if the program first ensures that it is not a session leader.

Normally, when launching a daemon, setsid is called (from the child process after calling fork) to dissociate the daemon from its controlling terminal. However, calling setsid also means that the calling process will be the session leader of the new session, which leaves open the possibility that the daemon could reacquire a controlling terminal. The double-fork technique ensures that the daemon process is not the session leader, which then guarantees that a call to open, as in the example above, will not result in the daemon process reacquiring a controlling terminal.

The double-fork technique is a bit paranoid. It may not be necessary if you know that the daemon will never open a terminal device file. Also, on some systems it may not be necessary even if the daemon does open a terminal device file, since that behavior is implementation-defined. However, one thing that is not implementation-defined is that only a session leader can allocate the controlling terminal. If a process isn't a session leader, it can't allocate a controlling terminal. Therefore, if you want to be paranoid and be certain that the daemon process cannot inadvertently acquire a controlling terminal, regardless of any implementation-defined specifics, then the double-fork technique is essential.

Looking at the code referenced in the question, the justification is:

Fork a second child and exit immediately to prevent zombies. This causes the second child process to be orphaned, making the init process responsible for its cleanup. And, since the first child is a session leader without a controlling terminal, it's possible for it to acquire one by opening a terminal in the future (System V- based systems). This second fork guarantees that the child is no longer a session leader, preventing the daemon from ever acquiring a controlling terminal.

So it is to ensure that the daemon is re-parented onto init (just in case the process kicking off the daemon is long lived), and removes any chance of the daemon reacquiring a controlling tty. So if neither of these cases apply, then one fork should be sufficient. "Unix Network Programming - Stevens" has a good section on this.

Taken from Bad CTK:

"On some flavors of Unix, you are forced to do a double-fork on startup, in order to go into daemon mode. This is because single forking isn’t guaranteed to detach from the controlling terminal."

According to "Advanced Programming in the Unix Environment", by Stephens and Rago, the second fork is more a recommendation, and it is done to guarantee that the daemon does not acquire a controlling terminal on System V-based systems.

One reason is that the parent process can immediately wait_pid() for the child, and then forget about it. When then grand-child dies, it's parent is init, and it will wait() for it - and taking it out of the zombie state.

The result is that the parent process doesn't need to be aware of the forked children, and it also makes it possible to fork long running processes from libs etc.

The daemon() call has the parent call _exit() if it succeeds. The original motivation may have been to allow the parent to do some extra work while the child is daemonizing.

It may also be based on a mistaken belief that it's necessary in order to ensure the daemon has no parent process and is reparented to init - but this will happen anyway once the parent dies in the single fork case.

So I suppose it all just boils down to tradition in the end - a single fork is sufficient as long as the parent dies in short order anyway.

Related questions
                            
                                Ignoring NaNs with str.contains
                            
                                How to read first N lines of a file?
                            
                                Suppress Scientific Notation in Numpy When Creating Array From Nested List
                            
                                How to set the timezone in Django?
                            
                                Where's my JSON data in my incoming Django request?
                            
                                How to condense if/else into one line in Python? [duplicate]
                            
                                How to remove all characters after a specific character in python?
                            
                                python list by value not by reference [duplicate]
                            
                                Why is TensorFlow 2 much slower than TensorFlow 1?
                            
                                How to scp in Python?
                            
                                Looping over a list in Python
                            
                                Web scraping with Python [closed]
                            
                                Python Threading String Arguments
                            
                                How to efficiently compare two unordered lists (not sets) in Python?
                            
                                Generating matplotlib graphs without a running X server [duplicate]
                            
                                Python extract pattern matches
                            
                                How do I build a numpy array from a generator?
                            
                                "TypeError: (Integer) is not JSON serializable" when serializing JSON in Python?
                            
                                Pandas: sum DataFrame rows for given columns
                            
                                Using multiple arguments for string formatting in Python (e.g., '%s ... %s')

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

What is the reason for performing a double fork when creating a daemon?

Tags:

python

unix

daemon

People also ask

Recent Activity

Donate For Us