
Python wait until data is in sys.stdin

Tags: python, wait

My problem is the following:

My Python script receives data via sys.stdin, but it needs to wait until new data is available on sys.stdin.

As described in the Python man page, I use the following code, but it totally overloads my CPU.

#!/usr/bin/python -u
import sys
while 1:
    for line in sys.stdin.readlines():
        pass  # do something useful with line

Is there a good way to fix the high CPU usage?

Edit:

None of your solutions work, so here is my exact problem.

You can configure the apache2 daemon to send every log line to a program instead of writing it to a log file.

It looks something like this:

CustomLog "|/usr/bin/python -u /usr/local/bin/client.py" combined

Apache2 expects my script to run continuously, wait for data on sys.stdin, and parse it when there is data.

If I only use a for loop, the script will exit, because at some point there is no data in sys.stdin, and apache2 will complain that the script exited unexpectedly.

If I use a while true loop, my script will use 100% CPU.

asked Aug 14 '11 10:08 by Abalus



2 Answers

The following should just work.

import sys
for line in sys.stdin:
    pass  # process line here

Rationale:

The code will iterate over lines in stdin as they come in. If the stream is still open but there isn't a complete line, the loop will block until either a newline character is encountered (and the whole line is returned) or the stream is closed (and whatever is left in the buffer is returned).

Once the stream has been closed, no more data can be written to or read from stdin. Period.

The reason your code was overloading your CPU is that once stdin has been closed, any subsequent attempt to iterate over stdin returns immediately without doing anything. In essence, your code was equivalent to the following.

import sys

for line in sys.stdin:
    pass  # do something

while 1:
    pass  # infinite loop, very CPU intensive

Maybe it would be useful if you posted how you were writing data to stdin.
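To see why the original loop spins, here is a minimal sketch that uses io.StringIO as a stand-in for a stdin that has already reached EOF. The helper name empty_reads_after_eof is made up for this illustration. It shows that readlines() returns an empty list straight away on every call after the first, so an outer while loop around it has nothing to wait on.

```python
import io

def empty_reads_after_eof(stream, attempts=3):
    """Drain the stream once, then call readlines() `attempts` more times
    and collect what each of those later calls returns."""
    first = stream.readlines()  # consumes everything up to EOF
    later = [stream.readlines() for _ in range(attempts)]  # each returns at once
    return first, later

if __name__ == "__main__":
    # io.StringIO is only a stand-in for sys.stdin here (an assumption of this sketch)
    fake_stdin = io.StringIO("line one\nline two\n")
    first, later = empty_reads_after_eof(fake_stdin)
    print(first)  # the two lines that were available
    print(later)  # three empty lists
```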

EDIT:

Python will (for the purposes of for loops, iterators and readlines()) consider a stream closed when it encounters end-of-file (EOF). You can ask Python to read more data after this, but you cannot use any of the previous methods. The Python man page recommends using

import sys
while True:
    line = sys.stdin.readline()
    # do something with line

When EOF is encountered, readline returns an empty string. The next call to readline will function as normal if the stream is still open. You can test this yourself by running the program in a terminal. Pressing Ctrl+D on an empty line in a Unix terminal signals EOF to the program reading stdin (no actual character is written to the stream). This will cause the first program in this post to terminate, but the last program will continue to read data until the stream is actually closed. The last program should not use 100% of your CPU while the stream is open, as readline will wait until there is data to return rather than returning an empty string.

I only have the problem of a busy loop when I try readline from an actual file. But when reading from stdin, readline happily blocks.
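One caveat with a bare while True loop: once the writer really closes the pipe, readline() returns an empty string immediately on every call, so the loop would spin. A minimal sketch of a loop that blocks while the stream is open and stops at EOF could look like this. The names process_stream and handle are hypothetical, and treating the first empty string as the end of input is an assumption of the sketch.

```python
import io

def process_stream(stream, handle):
    """Call handle(line) for every line; return the line count at EOF."""
    count = 0
    while True:
        line = stream.readline()  # on a pipe this blocks until a line or EOF arrives
        if line == "":            # empty string means EOF: stop instead of spinning
            break
        handle(line.rstrip("\n"))
        count += 1
    return count

if __name__ == "__main__":
    # io.StringIO stands in for stdin so the demo finishes on its own
    seen = []
    total = process_stream(io.StringIO("GET /a\nGET /b\n"), seen.append)
    print(total, seen)
```

In a real script you would pass sys.stdin as the stream. A blank input line is "\n", not "", so it does not end the loop.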

answered Oct 14 '22 19:10 by Dunes


This actually works flawlessly (i.e. no runaway CPU) when you call the script from the shell, like so:

tail -f input-file | yourscript.py

Obviously, that is not ideal, since you then have to write all relevant output to that file, but it works without a lot of overhead, namely because of using readline(), I think:

import sys

while 1:
    line = sys.stdin.readline()

It will actually stop and wait at that line until it gets more input.
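If you want to check the blocking behaviour without a second terminal, here is a rough sketch that uses os.pipe() in place of the shell pipe and a thread in place of tail -f. The names timed_readline and late_writer are invented for this example. The reader calls readline() before any data exists and only returns once the writer thread sends a line.

```python
import os
import threading
import time

def timed_readline(delay=0.2):
    """Return (line, seconds_waited) for a readline() on a pipe that only
    receives data after `delay` seconds."""
    read_fd, write_fd = os.pipe()
    reader = os.fdopen(read_fd, "r")
    writer = os.fdopen(write_fd, "w")

    def late_writer():
        time.sleep(delay)        # nothing is in the pipe during this time
        writer.write("hello\n")
        writer.flush()

    thread = threading.Thread(target=late_writer)
    start = time.monotonic()
    thread.start()
    line = reader.readline()     # blocks here until the writer sends a line
    waited = time.monotonic() - start
    thread.join()
    writer.close()
    reader.close()
    return line, waited

if __name__ == "__main__":
    line, waited = timed_readline()
    print(repr(line), "after about", round(waited, 2), "seconds")
```

The wait is spent blocked inside readline(), not in a polling loop, which is why the CPU stays idle.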

Hope this helps someone!

answered Oct 14 '22 18:10 by rm-vanda