I have a simple C program to time process startup (I'd rather not post the full code as it's an active school assignment). My main function looks like this:
int main(void) {
int i;
for (i = 0; i < 5; i++) {
printf("%lf\n", sample_time());
}
exit(0);
}
sample_time()
is a function that times how long it takes to fork a new process and returns the result in seconds as a double
. The part of sample_time()
that forks:
double sample_time() {
// timing stuff
if (!fork()) exit(0); // immediately close new process
// timing stuff
return a_sample_time;
}
As expected, running the program, times
, in the terminal outputs 5 numbers like so:
$ ./times
0.000085
0.000075
0.000079
0.000071
0.000078
However, trying to redirect this into a file (or anywhere else) in a Unix terminal produces unexpected results.
For example, ./times > times.out
creates a file with fifteen numbers. Additionally, ./times | wc -l
outputs 15
, confirming the earlier result. Running ./times | cat
, I again see fifteen numbers, more than five of which are distinct.
Does anyone know what on earth can cause something like this? I'm out of ideas.
./times
!= ./times | cat
. Wat.
Fact 1 - When stdout is connected to a TTY it is line buffered. When it's connected to a file or a pipeline it is full buffered. This means it's only flushed every 8KB, say, rather than every line.
Fact 2 - Forked processes have duplicate copies of in-memory data. This includes stdio's output buffers if the data hasn't been flushed yet.
Fact 3 - Calling exit()
causes stdio's output buffers to be flushed before the program exits.
When your program prints to the terminal its output is line buffered. Each printf()
call that ends with \n
immediately prints. This means each line is printed and the in-memory output buffer is emptied before fork()
runs.
Result: 5 lines of output.
When libc sees that stdout isn't connected to a TTY it switches to a more efficient full buffering strategy. This causes output to be buffered until 4KB worth have accumulated. That means the output from the printf()
s is saved in memory, and calls to write()
are deferred.
if (!fork()) exit(0);
After forking, the child process has a copy of the buffered output. The exit()
call then causes that buffer to be flushed. This doesn't affect the parent process, though. Its output is still buffered.
Then when the second line of output is printed, it has two lines buffered. The next child process forks, exits, and prints those two lines. The parent retains its two lines of output, and so on.
Result: The child processes print 0, 1, 2, 3, and 4 lines of output. The main program prints 5 when it finally exits and flushes its output. 0 + 1 + 2 + 3 + 4 + 5 = 15. 15 lines of output instead of 5!
Call _Exit()
instead of exit()
. The function _Exit()
is like exit()
, but does not call any functions registered with atexit()
. This would be my preferred solution.
Explicitly set stdout to be line buffered: setvbuf(stdout, NULL, _IOLBF, 0);
Call fflush(stdout)
after each printf
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With