
Difference in buffering of stdout on Linux and Windows

There seems to be a difference in how stdout is buffered on Windows and on Linux when writing to the console. Consider this small Python script:

import time
for i in xrange(10):
    time.sleep(1)
    print "Working",  # trailing comma suppresses the newline (Python 2)

When running this script on Windows, we see each "Working" appear one after another with a one-second wait in between. On Linux, we have to wait for 10 seconds and then the whole line appears at once.

If we change the last line to print "Working" (without the trailing comma, so that each call ends with a newline), every line appears individually on Linux as well.

So on Linux, stdout seems to be line-buffered, and on Windows not buffered at all. We can switch off the buffering with the -u option (in which case the script behaves the same on Linux as on Windows). The documentation says:

-u Force stdin, stdout and stderr to be totally unbuffered.

So it does not actually say that stdin and stdout are buffered without the -u option. Hence my questions:

  1. What is the reason for different behavior on Linux/Windows?
  2. Is there some kind of guarantee that stdout will be buffered when redirected to a file, no matter which OS? At least this seems to be the case on Windows and Linux.

My main concern is not (as some answers assume) when the information is flushed, but that if stdout isn't buffered it might be a severe performance hit and one should not rely on it.

Edit: It might be worth noting that in Python 3 the behavior is the same on Linux and Windows (which is not really surprising, because the behavior is configured explicitly through the parameters of the print function).
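
For illustration, here is a minimal Python 3 sketch of what "configured explicitly" could look like. The function name report_progress and its parameters are made up for this example; only print's end, file and flush keywords come from Python 3 itself.

```python
import sys
import time


def report_progress(steps, delay=1.0, out=None):
    """Print "Working" `steps` times on one line, flushing after each word.

    `report_progress`, `steps`, `delay` and `out` are hypothetical names
    chosen for this sketch.
    """
    if out is None:
        out = sys.stdout
    for _ in range(steps):
        time.sleep(delay)
        # end=" " keeps everything on one line; flush=True pushes each word
        # out immediately instead of waiting for a newline or a full buffer.
        print("Working", end=" ", file=out, flush=True)
    print(file=out)  # finish the line


if __name__ == "__main__":
    report_progress(3, delay=0.2)
```

With flush=True the words should show up one at a time regardless of how the platform buffers the console.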

ead asked Aug 07 '17 11:08


4 Answers

Assuming you're talking about CPython (likely), this has to do with the behaviour of the underlying C implementations.

The ISO C standard mentions (C11 7.21.3 Files /3) three modes:

  • unbuffered (characters appear as soon as possible);
  • fully buffered (characters appear when the buffer is full); and
  • line buffered (characters appear on newline output).

There are other triggers that cause the characters to appear (such as buffer filling up even if no newline is output, requesting input under some circumstances, or closing the stream) but they're not important in the context of your question.

What is important is 7.21.3 Files /7 in that same standard:

As initially opened, the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device.

Note the wiggle room there. Standard output can either be line buffered or unbuffered unless the implementation knows for sure it's not an interactive device.

In this case (the console), it is an interactive device, so the implementation is not permitted to use fully buffered. It is, however, allowed to select either of the other two modes, which is why you're seeing the difference.

Unbuffered output would see the messages appear as soon as you output them (a la your Windows behaviour). Line-buffered would delay until output of a newline character (your Linux behaviour).
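
The rule from 7.21.3 /7 can be paraphrased as a small Python 3 sketch. This is only an illustration: permitted_modes is a made-up helper, and isatty() is just an approximation of however a given C library decides whether a stream is "interactive".

```python
import sys


def permitted_modes(stream):
    """Return the buffering modes C11 7.21.3 /7 would allow for stdout.

    Hypothetical helper: `stream.isatty()` stands in for the implementation's
    own (unspecified) test for an interactive device.
    """
    if stream.isatty():
        # Interactive device: anything except fully buffered is allowed.
        return ("line buffered", "unbuffered")
    # Determined to be non-interactive (file, pipe, ...): fully buffered.
    return ("fully buffered",)


if __name__ == "__main__":
    print(permitted_modes(sys.stdout))
```

Running it once on the console and once with the output redirected to a file should give the two different answers.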

If you really want to ensure your messages are flushed regardless of mode, just flush them yourself:

import time, sys
for i in xrange(10):
    time.sleep(1)
    print "Working",
    sys.stdout.flush()
print

In terms of guaranteeing that output will be buffered when redirecting to a file, that would be covered in the quotes from the standard I've already shown. If the stream can be determined to be using a non-interactive device, it will be fully buffered. That's not an absolute guarantee since it doesn't state how that's determined but I'd be surprised if any implementation couldn't figure that out.

In any case, you can test specific implementations just by redirecting the output and monitoring the file to see if it flushes once per output or at the end.
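
One way such a test might look, as a hedged Python 3 sketch (assuming Python 3.7 or later for the -u behaviour). The helper observe_redirected_write and the tiny child program are invented for this example. The child writes once without a newline, signals the parent through a raw write to stderr, and then blocks, so the parent can measure the file while the child is still alive.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical child program: one write with no newline and no flush,
# then a raw signal on stderr, then a raw read that keeps it alive.
CHILD = (
    "import os, sys\n"
    "sys.stdout.write('Working ')\n"
    "os.write(2, b'written\\n')\n"
    "os.read(0, 1)\n"
)


def observe_redirected_write(extra_args=()):
    """Return (size while the child is running, size after it exited)."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            child = subprocess.Popen(
                [sys.executable, *extra_args, "-c", CHILD],
                stdin=subprocess.PIPE, stdout=out, stderr=subprocess.PIPE,
            )
            child.stderr.readline()         # child has called write() by now
            during = os.path.getsize(path)  # 0 means the text is still buffered
            child.communicate(b"x")         # unblock the child and let it exit
        after = os.path.getsize(path)
        return during, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("default:", observe_redirected_write())
    print("with -u:", observe_redirected_write(["-u"]))
```

If the size is still zero while the child is running, the output was held in a buffer. If it already matches the final size, the write went straight through.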

paxdiablo answered Oct 08 '22 05:10


The behavior differs because the buffering is generally unspecified, which means implementations can do whatever they want. It also means that implementations can change at any time, or vary in undocumented ways, possibly even on the same platform.

For example, if you print a "long enough" string on Linux, with no newline (\n), it will likely be written through as if it had a newline (because it exceeds the buffer). You may also find the buffer size varies between stdout, pipes, and files.

It's really bad to depend on unspecified behavior, so use flush() when you really need the bytes to be written.

And if you need to control buffering (e.g. for performance reasons), then you need to implement your own buffering on top of write() and flush(). It's pretty straightforward to do, and that gives you complete control over how and when bytes are actually written.
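
As a rough Python 3 sketch of that idea (ChunkedWriter and its limit parameter are names made up here, not an existing API), a thin wrapper can collect text and only pass it on once a chosen threshold is reached:

```python
import sys


class ChunkedWriter:
    """Collect text and hand it to `stream` in chunks of at least `limit` characters.

    Hypothetical illustration of "your own buffering on top of write() and
    flush()"; the threshold policy is an assumption of this sketch.
    """

    def __init__(self, stream, limit=4096):
        self._stream = stream
        self._limit = limit
        self._parts = []
        self._size = 0

    def write(self, text):
        self._parts.append(text)
        self._size += len(text)
        if self._size >= self._limit:
            self.flush()

    def flush(self):
        if self._parts:
            self._stream.write("".join(self._parts))
            self._parts = []
            self._size = 0
        self._stream.flush()  # also push the underlying stream's own buffer


if __name__ == "__main__":
    out = ChunkedWriter(sys.stdout, limit=16)
    for _ in range(5):
        out.write("Working ")
    out.flush()  # write whatever is left
```

The caller decides the chunk size and must call flush() at the end, since nothing is written automatically when the wrapper goes away.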

payne answered Oct 08 '22 04:10


Windows and Linux have very different console output drivers. On Linux, your program's output is buffered until a \n is written.

If you want to force the buffer to flush manually, use:

import sys
sys.stdout.flush()

Lucas Hendren answered Oct 08 '22 04:10


This already has answers elsewhere, but I will summarize below.

  1. The different behavior on Windows versus Linux comes from the way the print command is implemented (as noted in the comment by eryksun). You can get more information about that over here and here.

  2. This can be remedied in many ways in Python. More on that over here.

AkshayDandekar answered Oct 08 '22 05:10