Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python read stream

I need a very inexpensive way of reading a buffer with no terminating string (a stream) in Python. This is what I have, but it wastes a a lot of CPU time and effort. Because it is constantly "trying and catching." I really need a new approach.

Here is a reduced working version of my code:

#! /usr/bin/env/ python
import fcntl, os, sys

if __name__ == "__main__":
    f = open("/dev/urandom", "r")
    fd = f.fileno()
    fl = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)

    ready = False
    line = ""
    while True:
        try:
            char = f.read()
            if char == '\r':
                continue
            elif char = '\n':
                ready = True
            else:
                line += char
        except:
            continue
        if ready:
            print line

Don't run this in the terminal. It's simply for illustration. "urandom" will break your terminal because it spits out a lot of random characters that the terminal emulator interprets no matter what (which can change your current shells settings, title, etc). I was reading from a gps connected via usb.

The problem: this uses 100% of the CPU usage when it can. I have tried this:

#! /usr/bin/env/ python
import fcntl, os, sys

if __name__ == "__main__":
    f = open("/dev/urandom", "r")
    fd = f.fileno()
    fl = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)

    for line in f.readlines():
        print line

However, I get IOError: [Errno 11] Resource temporarily unavailable. I have tried to use Popen amongst other things. I am at a loss. Can someone please provide a solution (and please explain everything, as I am not a pro, per se). Also, I should note that this is for Unix (particularly Linux, but it must be portable across all versions of Linux).

like image 709
dylnmc Avatar asked Sep 30 '14 18:09

dylnmc


2 Answers

You will want to set your buffering mode to the size of the chunk you want to read when opening the file stream. From python documentation:

io.open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True)

"buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size of a fixed-size chunk buffer."

You also want to use the readable() method in the while loop, to avoid unnecessary resource consumption.

However, I advise you to use buffered streams such as io.BytesIO or io.BufferedReader

More info in the docs.

like image 108
Anoyz Avatar answered Oct 20 '22 11:10

Anoyz


The simple solutions are the best:

with open('/dev/urandom', 'r') as f:
    for line in f:
        print line.encode('hex')  # Don't mess up my terminal

Or, alterantively

with open('/dev/urandom', 'r') as f:
    for line in iter(f.readline, ''):
        print line.encode('hex')  # Don't mess up my terminal

Notes:

  • Leave the file descriptor in blocking mode, so the OS can block your process (and save CPU time) when there is no data available.

  • It is important to use an iterator in the loop. Consider for line in f.readlines():. f.readlines() reads all of the data, puts it all in a list, and returns that list. Since we have infinite data, f.readlines() will never return successfully. In contrast, f returns an iterator -- it only gets as much data as it needs to satisfy the next loop iteration (and just a little more for a performance buffer.)

  • The first version reads ahead and buffers enough data to print several lines. The second version returns each line immediately. Use the first version if conserving CPU is your primary concern. Use the second if interactive response time is your primary concern.

Demonstration:

$ python x.py  | head -2l
eb99f1b3bf74eead42750c63cb7c16160fa7e21c94b176dc6fd2d6796a1428dc8c5d15f13e3c1d5969cb59317eaba37a97f4719bb3de87919009da013fa06ae738408478bc15c750850744a4edcc27d155749d840680bf3a827aafbe9be84e7c8e2fe5785d2305cbedd76454573ca9261ac9a480f71242baa94e8d4bdf761705a6a0fea1ba2b1502066b2538a62776e9165043e5b7337d45773d009fd06d15ca0d9b51af499c1c9d7684472272a7361751d220848874215bc494456b08910e9815fc533d3545129aad4f3f126dc5341266ca4b85ea949794cacaf16409bcd02263b08613190b3f69caa68a47758345dafb10121cfe6ed6c8098142682aef47d1080bd2e218b571824bf2fa5d0bb5297278be8a9a2f55b554631c99e5f1d9040c5bc2bde9a40c8b6e95fc47be6ea9235243582f2367893d15a1494f732d0346ec6184a366f8035aef9141c638128444b1549a64937697b1a170e648d20f336e352076893fa7265c8fa0f4e2207e87410e53b43a51aa146ac6c2decf274a45a58c4e442aececf28879a3e0b4a1278eac7a4f969b3f74e2f2a2064a55ff112c4c49092366dbaa125703962ec5083d09cdb750c0e1dbe34cadda66709f98ff63faccf0045993137bfaca949686bc395bbafb7cf9b5b3475a0c91bdea8cec4e9ac1a9c96e0b81c1c5f242ae72cdea4c073db0351322f9da31203ea34d1b6f298128435797f4846a53b0733069060680dbc2b44c662c4b685ced5419b65c01df41cc2dd9877dc2a97a965174d508a3c9275d8aee7f2991bbb06ca7e0010b0e5b9468aed12f5d2c9a65091223547b8655211df435ffbf24768d48c7e7cf3cb7225f2c116e94a8602078f2b34dab6852f57708e760f88f4085ec7dade19ed558a539f830adea1b81f46303789224802f1f090ec0ff59e291246f1287672b0035df07c359d2ada48e674622f61c0f456c36d130fb6cf7f529e7c4dfceccc594ba5e812a3250e022eca9576a5a8b31c0be13969841d5a4d52b10a7dc8ddd1cac279500cb66e3b244e7d1e042249fd8adf2a90fa8bee74378d79a3d55c6fcf6cc19aa85ffb078dba23ca88ea6810d4a1c5d98b3b33e68ddd41c881df167c36ab2e1b081849781e08e3a026fbd3755acf9f215e0402cbf1a021300f5c883f86a05d467479172109a8f20f93c8c255915a264463eb113c3e8d07d0cec31aa8c0f978a0e7e65c142e85383befd6679c69edd2c56599f15580bbb356d98cfdf012dbc6d1dd6c0dbcfe6f8235d3d5c015fb94d8cc29afdf5d69e33d0e5078d651782546bc2acccab9f35e595f0951a139526ae5651a3ebbec353e99f9ddd1615ed25529500dabe8bf6f12ee6b21a437caca12a6d9688986d94fb7c103dca1572350900e56276b857630a
9d024ef4454dcd5e35dd605a2d49c26ce44fae87ab33e7a158d328521c7d77969908ec5b67f01bf8e2c330dcb70b5f3def8e6d4b010c6d31e4cbe7478657782f10b6fc2d77e8ff7a2f1e590827827e1037b33b0a
Traceback (most recent call last):
  File "x.py", line 4, in <module>
    print line.encode('hex')  # Don't mess up my terminal
IOError: [Errno 32] Broken pipe
like image 12
Robᵩ Avatar answered Oct 20 '22 11:10

Robᵩ