
What's so special about file descriptor 3 on linux?

Tags: c++, c, linux, macos

I'm working on a server application that's going to work on Linux and Mac OS X. It goes like this:

  • start main application
  • fork off the controller process
  • call lock_down() in the controller process
  • terminate main application
  • the controller process then forks again, creating a worker process
  • eventually the controller keeps forking more worker processes

I can log using several methods (e.g. syslog or a file), but right now I'm puzzling over syslog. The "funny" thing is that no syslog output is ever seen from the controller process unless I include the #ifdef section below.

The worker processes log flawlessly on Mac OS X and Linux, with or without the #ifdef'ed section below. The controller also logs flawlessly on Mac OS X without the #ifdef'ed section, but on Linux the #ifdef is needed if I want to see any output in syslog (or in the log file, for that matter) from the controller process.

So, why is that?

static int
lock_down(void)
{
    struct rlimit rl;
    unsigned int n;
    int fd0;
    int fd1;
    int fd2;

    // Reset file mode mask
    umask(0);

    // change the working directory
    if ((chdir("/")) < 0)
        return EXIT_FAILURE;

    // close any and all open file descriptors
    if (getrlimit(RLIMIT_NOFILE, &rl))
        return EXIT_FAILURE;
    if (RLIM_INFINITY == rl.rlim_max)
        rl.rlim_max = 1024;

    for (n = 0; n < rl.rlim_max; n++) {
#ifdef __linux__        
        if (3 == n) // deep magic...
            continue;
#endif
        if (close(n) && (EBADF != errno))
            return EXIT_FAILURE;
    }

    // attach file descriptors 0, 1 and 2 to /dev/null
    fd0 = open("/dev/null", O_RDWR);
    fd1 = dup2(fd0, 1);
    fd2 = dup2(fd0, 2);
    if (0 != fd0)
        return EXIT_FAILURE;

    return EXIT_SUCCESS;
}

camh was close, but using closelog() was the idea that did the trick, so the honor goes to jilles. Something else, aside from closing a file descriptor from under syslog's feet, must be going on though. To make the code work I added a call to closelog() just before the loop:

closelog();
for (n = 0; n < rl.rlim_max; n++) {
    if (close(n) && (EBADF != errno))
        return EXIT_FAILURE;
}

I was relying on a literal reading of the manual page, which says:

The use of openlog() is optional; it will automatically be called by syslog() if necessary...

I interpreted this as saying that syslog would detect that its file descriptor had been closed from under it and reopen the log. Apparently it does not. An explicit closelog() was needed on Linux to tell syslog that the descriptor was closed.

One more thing that still perplexes me is that not using closelog() prevented the first forked process (the controller) from even opening and using a log file. The subsequently forked processes could use syslog or a log file with no problems. Maybe there is some caching effect in the filesystem that gives the first forked process an unreliable "idea" of which file descriptors are available, while the next set of forked processes is sufficiently delayed not to be affected by it?

asked Aug 20 '10 22:08 by colding


2 Answers

The special aspect of file descriptor 3 is that it will usually be the first file descriptor returned from a system call that allocates a new file descriptor, given that 0, 1 and 2 are usually set up for stdin, stdout and stderr.

This means that if any library function you have called allocates a file descriptor for its own internal purposes in order to perform its functions, it will get fd 3.

The openlog(3) library call will need to open /dev/log to communicate with the syslog daemon. If you subsequently close all file descriptors, you may break the syslog library functions if they are not written in a way to handle that.

answered Sep 21 '22 09:09 by camh


The way to debug this on Linux is to use strace to trace the actual system calls that are being made; the use of a file descriptor for syslog then becomes obvious:

$ cat syslog_test.c
#include <stdio.h>
#include <syslog.h>

int main(void)
{
    openlog("test", LOG_PID, LOG_LOCAL0);
    syslog(LOG_ERR, "waaaaaah");
    closelog();
    return 0;
}
$ gcc -W -Wall -o syslog_test syslog_test.c
$ strace ./syslog_test
...
socket(PF_FILE, SOCK_DGRAM, 0)          = 3
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
connect(3, {sa_family=AF_FILE, path="/dev/log"}, 16) = 0
send(3, "<131>Aug 21 00:47:52 test[24264]"..., 42, MSG_NOSIGNAL) = 42
close(3)                                = 0
exit_group(0)                           = ?
Process 24264 detached

answered Sep 23 '22 09:09 by Matthew Slattery