Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python select() behavior is strange

I'm having some trouble understanding the behavior of select.select. Please consider the following Python program:

def str_to_hex(s):
    def dig(n):
        if n > 9:
            return chr(65-10+n)
        else:
            return chr(48+n)
    r = ''
    while len(s) > 0:
        c = s[0]
        s = s[1:]
        a = ord(c) / 16
        b = ord(c) % 16
        r = r + dig(a) + dig(b)
    return r

while True:
    ans,_,_ = select.select([sys.stdin],[],[])
    print ans
    s = ans[0].read(1)
    if len(s) == 0: break
    print str_to_hex(s)

I have saved this to a file "test.py". If invoke it as follows:

echo 'hello' | ./test.py

then I get the expected behavior: select never blocks and all of the data is printed; the program then terminates.

But if I run the program interactively, I get a most undesirable behavior. Please consider the following console session:

$ ./test.py
hello
[<open file '<stdin>', mode 'r' at 0xb742f020>]
68

The program then hangs there; select.select is now blocking again. It is not until I provide more input or close the input stream that the next character (and all of the rest of them) are printed, even though there are already characters waiting! Can anyone explain this behavior to me? I am seeing something similar in a stream tunneling program I have written and it's wrecking the entire affair.

Thanks for reading!

like image 577
tvynr Avatar asked May 15 '11 21:05

tvynr


1 Answers

The read method of sys.stdin works at a higher level of abstraction than select. When you do ans[0].read(1), python actually reads a larger number of bytes from the operating system and buffers them internally. select is not aware of this extra buffering; It only sees that everything has been read, and so will block until either an EOF or more input arrives. You can observe this behaviour by running something like strace -e read,select python yourprogram.py.

One solution is to replace ans[0].read(1) with os.read(ans[0].fileno(), 1). os.read is a lower level interface without any buffering between it and the operating system, so it's a better match for select.

Alternatively, running python with the -u commandline option also seems to disable the extra buffering.

like image 96
slowdog Avatar answered Oct 19 '22 03:10

slowdog