Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Strange behavior performing library functions on STDOUT and STDIN's file descriptors

Throughout my years as a C programmer, I've always been confused about the standard stream file descriptors. Some places, like Wikipedia[1], say:

In the C programming language, the standard input, output, and error streams are attached to the existing Unix file descriptors 0, 1 and 2 respectively.

This is backed up by unistd.h:

/* Standard file descriptors.  */
#define STDIN_FILENO    0       /* Standard input.  */
#define STDOUT_FILENO   1       /* Standard output.  */
#define STDERR_FILENO   2       /* Standard error output.  */

However, this code (on any system):

write(0, "Hello, World!\n", 14);

Will print Hello, World! (and a newline) to STDOUT. This is odd because STDOUT's file descriptor is supposed to be 1. write-ing to file descriptor 1 also prints to STDOUT.

Performing an ioctl on file descriptor 0 changes standard input[2], and on file descriptor 1 changes standard output. However, performing termios functions on either 0 or 1 changes standard input[3][4].

I'm very confused about the behavior of file descriptors 1 and 0. Does anyone know why:

  • writeing to 1 or 0 writes to standard output?
  • Performing ioctl on 1 modifies standard output and on 0 modifies standard input, but performing tcsetattr/tcgetattr on either 1 or 0 works for standard input?
like image 619
MD XF Avatar asked Aug 06 '17 21:08

MD XF


2 Answers

I guess it is because in my Linux, both 0 and 1 are by default opened with read/write to the /dev/tty which is the controlling terminal of the process. So indeed it is possible to even read from stdout.

However this breaks as soon as you pipe something in or out:

#include <unistd.h>
#include <errno.h>
#include <stdio.h>

int main() {
    errno = 0;
    write(0, "Hello world!\n", 14);
    perror("write");
}

and run with

% ./a.out 
Hello world!
write: Success
% echo | ./a.out
write: Bad file descriptor

termios functions always work on the actual underlying terminal object, so it doesn't matter whether 0 or 1 is used for as long as it is opened to a tty.

like image 160

Let's start by reviewing some of the key concepts involved:

  • File description

    In the operating system kernel, each file, pipe endpoint, socket endpoint, open device node, and so on, has a file description. The kernel uses these to keep track of the position in the file, the flags (read, write, append, close-on-exec), record locks, and so on.

    The file descriptions are internal to the kernel, and do not belong to any process in particular (in typical implementations).
     

  • File descriptor

    From the process viewpoint, file descriptors are integers that identify open files, pipes, sockets, FIFOs, or devices.

    The operating system kernel keeps a table of descriptors for each process. The file descriptor used by the process is simply an index to this table.

    The entries to in the file descriptor table refer to a kernel file description.

Whenever a process uses dup() or dup2() to duplicate a file descriptor, the kernel only duplicates the entry in the file descriptor table for that process; it does not duplicate the file description it keeps to itself.

When a process forks, the child process gets its own file descriptor table, but the entries still point to the exact same kernel file descriptions. (This is essentially a shallow copy, will all file descriptor table entries being references to file descriptions. The references are copied; the referred to targets remain the same.)

When a process sends a file descriptor to another process via an Unix Domain socket ancillary message, the kernel actually allocates a new descriptor on the receiver, and copies the file description the transferred descriptor refers to.

It all works very well, although it is a bit confusing that "file descriptor" and "file description" are so similar.

What does all that have to do with the effects the OP is seeing?

Whenever new processes are created, it is common to open the target device, pipe, or socket, and dup2() the descriptor to standard input, standard output, and standard error. This leads to all three standard descriptors referring to the same file description, and thus whatever operation is valid using one file descriptor, is valid using the other file descriptors, too.

This is most common when running programs on the console, as then the three descriptors all definitely refer to the same file description; and that file description describes the slave end of a pseudoterminal character device.

Consider the following program, run.c:

#define  _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>

static void wrerrp(const char *p, const char *q)
{
    while (p < q) {
        ssize_t  n = write(STDERR_FILENO, p, (size_t)(q - p));
        if (n > 0)
            p += n;
        else
            return;
    }
}

static inline void wrerr(const char *s)
{
    if (s)
        wrerrp(s, s + strlen(s));
}

int main(int argc, char *argv[])
{
    int fd;

    if (argc < 3) {
        wrerr("\nUsage: ");
        wrerr(argv[0]);
        wrerr(" FILE-OR-DEVICE COMMAND [ ARGS ... ]\n\n");
        return 127;
    }

    fd = open(argv[1], O_RDWR | O_CREAT, 0666);
    if (fd == -1) {
        const char *msg = strerror(errno);
        wrerr(argv[1]);
        wrerr(": Cannot open file: ");
        wrerr(msg);
        wrerr(".\n");
        return 127;
    }

    if (dup2(fd, STDIN_FILENO) != STDIN_FILENO ||
        dup2(fd, STDOUT_FILENO) != STDOUT_FILENO) {
        const char *msg = strerror(errno);
        wrerr("Cannot duplicate file descriptors: ");
        wrerr(msg);
        wrerr(".\n");
        return 126;
    }
    if (dup2(fd, STDERR_FILENO) != STDERR_FILENO) {
        /* We might not have standard error anymore.. */
        return 126;
    }

    /* Close fd, since it is no longer needed. */
    if (fd != STDIN_FILENO && fd != STDOUT_FILENO && fd != STDERR_FILENO)
        close(fd);

    /* Execute the command. */
    if (strchr(argv[2], '/'))
        execv(argv[2], argv + 2);  /* Command has /, so it is a path */
    else
        execvp(argv[2], argv + 2); /* command has no /, so it is a filename */

    /* Whoops; failed. But we have no stderr left.. */
    return 125;
}

It takes two or more parameters. The first parameter is a file or device, and the second is the command, with the rest of the parameters supplied to the command. The command is run, with all three standard descriptors redirected to the file or device named in the first parameter. You can compile the above with gcc using e.g.

gcc -Wall -O2 run.c -o run

Let's write a small tester utility, report.c:

#define  _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>

int main(int argc, char *argv[])
{
    char    buffer[16] = { "\n" };
    ssize_t result;
    FILE   *out;

    if (argc != 2) {
        fprintf(stderr, "\nUsage: %s FILENAME\n\n", argv[0]);
        return EXIT_FAILURE;
    }

    out = fopen(argv[1], "w");
    if (!out)
        return EXIT_FAILURE;

    result = write(STDIN_FILENO, buffer, 1);
    if (result == -1) {
        const int err = errno;
        fprintf(out, "write(STDIN_FILENO, buffer, 1) = -1, errno = %d (%s).\n", err, strerror(err));
    } else {
        fprintf(out, "write(STDIN_FILENO, buffer, 1) = %zd%s\n", result, (result == 1) ? ", success" : "");
    }

    result = read(STDOUT_FILENO, buffer, 1);
    if (result == -1) {
        const int err = errno;
        fprintf(out, "read(STDOUT_FILENO, buffer, 1) = -1, errno = %d (%s).\n", err, strerror(err));
    } else {
        fprintf(out, "read(STDOUT_FILENO, buffer, 1) = %zd%s\n", result, (result == 1) ? ", success" : "");
    }

    result = read(STDERR_FILENO, buffer, 1);
    if (result == -1) {
        const int err = errno;
        fprintf(out, "read(STDERR_FILENO, buffer, 1) = -1, errno = %d (%s).\n", err, strerror(err));
    } else {
        fprintf(out, "read(STDERR_FILENO, buffer, 1) = %zd%s\n", result, (result == 1) ? ", success" : "");
    }

    if (ferror(out))
        return EXIT_FAILURE;
    if (fclose(out))
        return EXIT_FAILURE;

    return EXIT_SUCCESS;
}

It takes exactly one parameter, a file or device to write to, to report whether writing to standard input, and reading from standard output and error work. (We can normally use $(tty) in Bash and POSIX shells, to refer to the actual terminal device, so that the report is visible on the terminal.) Compile this one using e.g.

gcc -Wall -O2 report.c -o report

Now, we can check some devices:

./run /dev/null    ./report $(tty)
./run /dev/zero    ./report $(tty)
./run /dev/urandom ./report $(tty)

or on whatever we wish. On my machine, when I run this on a file, say

./run some-file ./report $(tty)

writing to standard input, and reading from standard output and standard error all work -- which is as expected, as the file descriptors refer to the same, readable and writable, file description.

The conclusion, after playing with the above, is that there is no strange behaviour here at all. It all behaves exactly as one would expect, if file descriptors as used by processes are simply references to operating system internal file descriptions, and standard input, output, and error descriptors are duplicates of each other.

like image 2
Nominal Animal Avatar answered Nov 12 '22 20:11

Nominal Animal