I've implemented a non-blocking reader in Python, and I need to make it more efficient.
The background: I have massive amounts of output that I need to read from one subprocess (started with Popen()) and pass to another thread. Reading the output from that subprocess must not block for more than a few ms (preferably for as little time as is necessary to read available bytes).
Currently, I have a utility class which takes a file descriptor (stdout) and a timeout. I select() and readline(1) until one of three things happens:
Then I return the buffered text to the calling method, which does stuff with it.
Now, for the real question: because I'm reading so much output, I need to make this more efficient. I'd like to do that by asking the file descriptor how many bytes are pending and then readline([that many bytes]). It's supposed to just pass stuff through, so I don't actually care where the newlines are, or even if there are any. Can I ask the file descriptor how many bytes it has available for reading, and if so, how?
I've done some searching, but I'm having a really hard time figuring out what to search for, let alone if it's possible.
Even just a point in the right direction would be helpful.
Note: I'm developing on Linux, but that shouldn't matter for a "Pythonic" solution.
On Linux, os.pipe() is just a wrapper around pipe(2). Both return a pair of file descriptors. Normally one would use lseek(2) (os.lseek() in Python) to reposition the offset of a file descriptor as a way to get the amount of available data. However, not all file descriptors are capable of seeking.
On Linux, trying lseek(2) on a pipe returns an error (ESPIPE); see the manual page. That's because a pipe is more or less a buffer between a producer and a consumer of data. The size of that buffer is system dependent.
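To see the failure for yourself, here is a small sketch. The helper name is_seekable is made up for illustration; the only calls used are os.pipe(), os.lseek() and os.close() from the standard library.

```python
import errno
import os


def is_seekable(fd):
    """Return True if lseek(2) works on fd, False if it fails with ESPIPE."""
    try:
        # Seeking 0 bytes from the current position does not move anything;
        # it only checks whether the descriptor supports seeking at all.
        os.lseek(fd, 0, os.SEEK_CUR)
        return True
    except OSError as exc:
        if exc.errno == errno.ESPIPE:
            return False
        raise


read_fd, write_fd = os.pipe()
pipe_seekable = is_seekable(read_fd)
os.close(read_fd)
os.close(write_fd)
print(pipe_seekable)
```

On Linux this should print False, which is why the lseek() approach is no use for the stdout of a subprocess.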
On Linux, a pipe has a 64 kB buffer by default, so that is the most data you can have available at any one time.
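Since the buffer is bounded, one option is to skip readline() entirely and ask os.read() for up to one buffer's worth: once select() reports the descriptor as readable, os.read() returns whatever is there without waiting for the full amount. The sketch below assumes the 64 kB figure above; read_available and pending_bytes are illustrative names, not existing APIs. pending_bytes uses the FIONREAD ioctl, which I believe works on Linux pipes but should not be treated as portable.

```python
import fcntl
import os
import select
import struct
import termios

PIPE_CAPACITY = 65536  # assumption: the default Linux pipe buffer size


def read_available(fd, timeout):
    """Wait up to `timeout` seconds for data on fd.

    Returns None on timeout, b"" at end of file, otherwise the bytes that
    were ready (at most PIPE_CAPACITY of them).
    """
    ready, _, _ = select.select([fd], [], [], timeout)
    if not ready:
        return None
    # On a readable pipe this returns what is buffered right now.
    return os.read(fd, PIPE_CAPACITY)


def pending_bytes(fd):
    """Linux-specific: ask the kernel how many bytes are waiting on fd."""
    raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]
```

With Popen you would pass proc.stdout.fileno() as fd. Don't mix this with proc.stdout.readline() on the same pipe, because the file object keeps its own buffer and select() cannot see data already sitting in it.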
Edit: If you can change the way your subprocess works, you might consider using a memory mapped file, or a nice big piece of shared memory.
Edit2: Using polling objects is probably faster than select.
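A rough sketch of the polling variant, assuming a platform that provides select.poll() (Linux does). The names make_poller and poll_read are hypothetical; the poller is created once and reused so that each call costs only one poll() and one read().

```python
import os
import select


def make_poller(fd):
    """Create a polling object that watches fd for readable data."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return poller


def poll_read(poller, fd, timeout_ms, max_bytes=65536):
    """Wait up to timeout_ms milliseconds for fd to have data or hang up.

    Returns None on timeout, b"" at end of file, otherwise the ready bytes.
    """
    events = poller.poll(timeout_ms)
    if not events:
        return None
    # A hang-up with nothing left to read shows up here as an empty result.
    return os.read(fd, max_bytes)
```

Whether this is measurably faster than select() for a single descriptor is something you would have to time on your own workload.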