
Should I set stdout and stdin to be unbuffered in C?

Because of stdin and stdout buffering, printf, scanf and getchar sometimes appear not to be executed. I usually flush the output buffer with fflush(stdout), but that can make the code very hard to read. If I make stdin and stdout unbuffered with setbuf(stdin, NULL) and setbuf(stdout, NULL), will my program perform better or worse?

Borisav Živanović asked Feb 06 '16 10:02




1 Answer

Making stdin or stdout completely unbuffered can make your program perform worse if it handles large quantities of input or output from and to files: most I/O requests are then broken down into system calls on a byte-by-byte basis.

Note that buffering does not prevent printf, scanf and getchar from being executed: the printf output may simply be delayed on its way to the final destination, so an input operation via scanf or getchar may occur before the prompt appears.

Note also that setting input as unbuffered may not be effective from the terminal because the terminal itself performs its own buffering, controlled via stty or ioctl.
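
As a rough illustration of that terminal-side buffering, here is a sketch that assumes a POSIX system and uses the termios interface (tcgetattr and tcsetattr, the portable wrappers over the terminal ioctl calls). The helper names set_char_input and restore_input are made up for this example. It switches off canonical (line-at-a-time) mode so that reads return as soon as one byte is available:

```c
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper (POSIX only): disable the terminal's canonical mode
   on fd so input is delivered byte by byte instead of line by line.
   The previous settings are stored in *saved for later restoration.
   Returns 0 on success, -1 if fd is not a terminal or a call fails. */
int set_char_input(int fd, struct termios *saved) {
    struct termios t;

    if (!isatty(fd) || tcgetattr(fd, saved) != 0)
        return -1;
    t = *saved;
    t.c_lflag &= ~(tcflag_t)ICANON;  /* no line editing, no waiting for Enter */
    t.c_cc[VMIN] = 1;                /* read() returns once 1 byte is available */
    t.c_cc[VTIME] = 0;               /* no inter-byte timeout */
    return tcsetattr(fd, TCSANOW, &t);
}

/* Hypothetical helper: put the saved terminal settings back. */
int restore_input(int fd, const struct termios *saved) {
    return tcsetattr(fd, TCSANOW, saved);
}
```

A program would call set_char_input(STDIN_FILENO, &saved) once at startup and restore_input(STDIN_FILENO, &saved) before exiting. When stdin is redirected from a file, the first call fails and the stream behaves as usual.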

Most C libraries have a hack that causes stdout to be flushed when reading from stdin requires getting data from the system, but this behavior is not specified in the C Standard, so some libraries do not implement it. It is safe to add calls to fflush(stdout); before input operations, and after transitory messages such as progress meters. For most purposes, it is best to let the C startup determine the appropriate buffering strategy depending upon the type of system handle associated with the stdin and stdout streams. The common default is line buffered for devices and fully buffered with a size of BUFSIZ for files.
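
One way to keep explicit flushes from cluttering the code is to wrap the flush-before-input pattern in a small helper. The sketch below is only an illustration: the name prompt_line and its signature are invented here, and only fputs, fflush, fgets and strcspn come from the standard library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a prompt to out, flush it, then read one line
   from in into buf (at most size - 1 characters).
   Returns 0 on success, -1 on end of file, read error, or size == 0. */
int prompt_line(FILE *in, FILE *out, const char *prompt, char *buf, size_t size) {
    fputs(prompt, out);
    fflush(out);  /* make the prompt visible before blocking on input */
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';  /* strip the trailing newline, if any */
    return 0;
}
```

Typical use would be prompt_line(stdin, stdout, "Name: ", name, sizeof name), which leaves a single fflush call in one place instead of one before every scanf or getchar.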

To get an idea of the potential performance hit, compile this naive ccat program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
    int c;
    long size;  /* strtol returns a long */

    if (argc > 1) {
        /* argument: "BUFSIZ" or a number (base auto-detected, 0x... is hex) */
        if (!strcmp(argv[1], "BUFSIZ"))
            size = BUFSIZ;
        else
            size = strtol(argv[1], NULL, 0);

        if (size == 0) {
            /* make stdin and stdout unbuffered */
            setvbuf(stdin, NULL, _IONBF, 0);
            setvbuf(stdout, NULL, _IONBF, 0);
        } else if (size > 0) {
            /* make stdin and stdout fully buffered */
            setvbuf(stdin, NULL, _IOFBF, (size_t)size);
            setvbuf(stdout, NULL, _IOFBF, (size_t)size);
        } else {
            /* negative size: make stdin and stdout line buffered */
            setvbuf(stdin, NULL, _IOLBF, (size_t)-size);
            setvbuf(stdout, NULL, _IOLBF, (size_t)-size);
        }
    }
    /* copy stdin to stdout one character at a time */
    while ((c = getchar()) != EOF) {
        putchar(c);
    }
    return 0;
}

Time the program copying a large file, repeating each run several times to minimize file caching side effects.

On a Debian Linux box I get these timings for a 3.8MB text file:

chqrlie@linux:~/dev/stackoverflow$ time wc w
 396684  396684 3755392 w
real    0m0.072s
user    0m0.068s
sys     0m0.000s

chqrlie@linux:~/dev/stackoverflow$ time cat < w > ww
real    0m0.008s
user    0m0.000s
sys     0m0.004s

chqrlie@linux:~/dev/stackoverflow$ time ./ccat < w > ww
real    0m0.060s
user    0m0.056s
sys     0m0.000s

chqrlie@linux:~/dev/stackoverflow$ time ./ccat 0x100000 < w > ww
real    0m0.060s
user    0m0.058s
sys     0m0.000s

chqrlie@linux:~/dev/stackoverflow$ time ./ccat 0 < w > ww
real    0m5.326s
user    0m0.632s
sys     0m4.684s

chqrlie@linux:~/dev/stackoverflow$ time ./ccat -0x1000 < w > ww
real    0m0.533s
user    0m0.104s
sys     0m0.428s

As you can see:

  • setting stdin and stdout to unbuffered slows the program down by a factor of almost 100 (5.326s versus 0.060s),
  • using line buffering slows it down by a factor of almost 10 (because lines are short, 9 to 10 bytes on average),
  • using a larger buffer does not show any improvement; the timing difference is not significant,
  • the naive implementation is quite fast, but the real cat utility uses faster APIs and runs about 7 times faster (0.008s versus 0.060s).
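
The "faster APIs" are not named above. A plausible candidate on POSIX systems is calling read and write directly with a large block, which skips the per-character overhead of getchar and putchar. The following is a sketch under that assumption, not the actual cat source; the helper name copy_fd and the 64 KiB block size are arbitrary choices made for this example.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (POSIX only): copy everything from descriptor in to
   descriptor out in blocks of up to 64 KiB.
   Returns 0 on success, -1 on a read or write error. */
int copy_fd(int in, int out) {
    static char buf[65536];
    ssize_t n;

    while ((n = read(in, buf, sizeof buf)) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;  /* interrupted by a signal: retry the read */
            return -1;
        }
        /* write may accept fewer bytes than requested: loop until all are out */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;  /* interrupted by a signal: retry the write */
                return -1;
            }
            off += w;
        }
    }
    return 0;
}
```

A cat-like program would simply call copy_fd(STDIN_FILENO, STDOUT_FILENO) from main. This issues one system call per block rather than one per byte, which is the same amortization that stdio buffering provides.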

In conclusion: do not set stdin and stdout to unbuffered; it significantly degrades performance even for moderately large files.

chqrlie answered Nov 02 '22 03:11
