My problem is the following:
My Python script receives data via sys.stdin, but it needs to wait until new data is available on sys.stdin.
As described in the Python man page, I use the following code, but it completely overloads my CPU.
#!/usr/bin/python -u
import sys

while 1:
    for line in sys.stdin.readlines():
        # do something useful
        pass
Is there a good way to avoid the high CPU usage?
Edit:
None of your solutions work, so let me describe my problem exactly.
You can configure the Apache2 daemon to send every log line to a program instead of writing it to a logfile.
That looks something like this:
CustomLog "|/usr/bin/python -u /usr/local/bin/client.py" combined
Apache2 expects my script to run permanently, wait for data on sys.stdin, and parse it whenever data arrives.
If I only use a for loop, the script will exit, because at some point there is no data on sys.stdin, and Apache2 will complain that my script exited unexpectedly.
If I use a while-True loop, my script uses 100% CPU.
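A minimal sketch of the kind of client.py this setup expects (the handle() function and its IP-extraction logic are hypothetical placeholders; the blocking for-loop is the important part):

```python
#!/usr/bin/python -u
# Sketch of a client.py for the CustomLog pipe above. handle() is a
# hypothetical placeholder for whatever per-line processing you need.
import sys

def handle(line):
    # Placeholder: return the first field of the line (the client IP
    # in the "combined" log format).
    return line.split(' ', 1)[0]

def main():
    # Iterating sys.stdin blocks until a full line arrives, so this
    # does not spin the CPU; the loop ends when Apache closes the pipe.
    for line in sys.stdin:
        handle(line.rstrip('\n'))

if __name__ == '__main__':
    main()
```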
The following should just work.
import sys

for line in sys.stdin:
    # whatever
    pass
Rationale:
The code will iterate over lines in stdin as they come in. If the stream is still open but there isn't a complete line, the loop will hang until either a newline character is encountered (and the whole line is returned) or the stream is closed (and whatever is left in the buffer is returned).
Once the stream has been closed, no more data can be written to or read from stdin. Period.
The reason your code was overloading your CPU is that once stdin has been closed, any subsequent attempt to iterate over stdin returns immediately without doing anything. In essence, your code was equivalent to the following.
for line in sys.stdin:
    # do something
    pass

while 1:
    pass  # infinite loop, very CPU intensive
Maybe it would be useful if you posted how you were writing data to stdin.
EDIT:
Python will (for the purposes of for loops, iterators, and readlines()) consider a stream closed when it encounters an EOF character. You can ask Python to read more data after this, but you cannot use any of the previous methods. The Python man page recommends using
import sys

while True:
    line = sys.stdin.readline()
    # do something with line
When an EOF character is encountered, readline will return an empty string. The next call to readline will function as normal if the stream is still open. You can test this yourself by running the command in a terminal: pressing Ctrl+D causes the terminal to write an EOF character to stdin. This will cause the first program in this post to terminate, but the last program will continue to read data until the stream is actually closed. The last program should not peg your CPU at 100%, because readline waits until there is data to return rather than returning an empty string.
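The empty-string-at-EOF behaviour is easy to see with an ordinary pipe (a sketch using os.pipe rather than stdin itself, so it runs standalone):

```python
import os

# Demonstrate readline() semantics on a pipe: it blocks while the
# writer is open and returns '' once the write end is closed (EOF).
r_fd, w_fd = os.pipe()
reader = os.fdopen(r_fd, 'r')
writer = os.fdopen(w_fd, 'w')

writer.write('hello\n')
writer.close()                 # closing the write end signals EOF

first = reader.readline()      # 'hello\n' - a complete line
second = reader.readline()     # ''        - EOF, returns immediately
reader.close()
```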
I only have the problem of a busy loop when I try readline from an actual file. But when reading from stdin, readline happily blocks.
This actually works flawlessly (i.e. no runaway CPU) when you call the script from the shell, like so:
tail -f input-file | yourscript.py
Obviously, that is not ideal, since you then have to write all the relevant output to that file first, but it works without much overhead - namely, I think, because of readline():
while 1:
    line = sys.stdin.readline()
It will actually stop and wait at that line until it gets more input.
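If you also want the loop to exit cleanly once the other end closes the pipe (instead of spinning on empty strings), a sketch like the following works; the stream parameter is only there so it can be tried on any file-like object:

```python
import sys

def consume(stream):
    # readline() blocks while the stream is open, so this loop does not
    # busy-wait. An empty string means EOF, so break instead of spinning.
    n = 0
    while True:
        line = stream.readline()
        if line == '':         # EOF (a blank line would be '\n', not '')
            break
        n += 1                 # "do something" with line here
    return n

if __name__ == '__main__':
    consume(sys.stdin)
```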
Hope this helps someone!