non-blocking read/log from an http stream

Question

I have a client that connects to an HTTP stream and logs the text data it consumes.

I send the streaming server an HTTP GET request. The server replies and continuously publishes data. It either publishes text or regularly sends a ping (text) message, and it never closes the connection.

I need to read and log the data it consumes in a non-blocking manner.

I am doing something like this:

import urllib2

req = urllib2.urlopen(url)
for dat in req:
    with open('out.txt', 'a') as f:
        f.write(dat)

My questions are:
Will this ever block when the stream is continuous?
How much data is read in each chunk, and can it be specified/tuned?
Is this the best way to read/log an HTTP stream?

Vinay Sajip · Accepted Answer

Hey, that's three questions in one! ;-)

It could block sometimes - even if your server is generating data quite quickly, network bottlenecks could in theory cause your reads to block.
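One way to bound how long a read can stall is a socket timeout; urllib2.urlopen accepts a timeout argument for this (Python 2.6 and later). The sketch below only illustrates the effect, using a local socket pair as a stand-in for the real HTTP connection. The helper name read_with_timeout and the 0.2 second value are assumptions made for the example.

```python
import socket

def read_with_timeout(sock, timeout_seconds, max_bytes=1024):
    """Return up to max_bytes from sock, or None if nothing arrives in time."""
    sock.settimeout(timeout_seconds)
    try:
        return sock.recv(max_bytes)
    except socket.timeout:
        # No data arrived within the timeout; the caller decides what to do.
        return None

# A local socket pair stands in for the client/server connection.
server_side, client_side = socket.socketpair()

quiet = read_with_timeout(client_side, 0.2)  # nothing has been sent yet
server_side.sendall(b'ping\n')
data = read_with_timeout(client_side, 0.2)   # data is now waiting

server_side.close()
client_side.close()
```

With a timeout in place, a stalled read returns control to your code instead of waiting indefinitely, at the cost of having to handle the timeout case.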

Reading the URL data using "for dat in req" will mean reading a line at a time - not really useful if you're reading binary data such as an image. You get better control if you use

chunk = req.read(size)

which can of course block.
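As a rough sketch of that chunked read, the helper below copies any file-like object (such as the response returned by urllib2.urlopen) to a file. The names copy_in_chunks and chunk_size are illustrative and not part of urllib2. Unlike the loop in the question, it opens the output file once rather than once per read.

```python
def copy_in_chunks(stream, out_path, chunk_size=4096):
    """Append the contents of a file-like stream to out_path, chunk by chunk.

    Returns the total number of bytes written.
    """
    total = 0
    with open(out_path, 'ab') as f:  # open once, append in binary mode
        while True:
            chunk = stream.read(chunk_size)  # may block until data arrives
            if not chunk:
                # An empty result means the other end closed the connection.
                break
            f.write(chunk)
            f.flush()  # push each chunk to disk so the log stays current
            total += len(chunk)
    return total
```

Depending on the Python version and the response's transfer encoding, read(size) may wait until size bytes have accumulated, so a smaller chunk_size can make slow ping traffic show up in the log sooner.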

Whether it's the best way depends on specifics not available in your question. For example, if you need to run with no blocking calls whatever, you'll need to consider a framework like Twisted. If you don't want blocking to hold you up and don't want to use Twisted (which is a whole new paradigm compared to the blocking way of doing things), then you can spin up a thread to do the reading and writing to file, while your main thread goes on its merry way:

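import threading
import urllib2

def func(req):
    # read from the URL stream and write to the file here
    with open('out.txt', 'a') as f:
        for dat in req:
            f.write(dat)

...

req = urllib2.urlopen(url)
t = threading.Thread(target=func, args=(req,))  # pass req through to func
t.start() # will execute func in a separate thread
...
t.join() # will wait for the spawned thread to die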

Obviously, I've omitted error checking/exception handling etc. but hopefully it's enough to give you the picture.
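Since the server in the question never closes the connection, a plain join() on the reader thread would wait indefinitely. One possible arrangement, sketched below, is to mark the thread as a daemon and join with a timeout. The names read_and_log and start_logger are hypothetical, and the stream argument is assumed to be any file-like object that yields lines, such as the urllib2 response.

```python
import threading

def read_and_log(stream, out_path):
    """Append each line from a file-like stream to out_path."""
    with open(out_path, 'ab') as f:
        for line in stream:  # blocks only this worker thread
            f.write(line)
            f.flush()        # keep the log current between pings

def start_logger(stream, out_path):
    """Run read_and_log in a daemon thread and return the thread."""
    t = threading.Thread(target=read_and_log, args=(stream, out_path))
    # A daemon thread does not keep the process alive, which matters
    # when the stream never ends.
    t.daemon = True
    t.start()
    return t
```

The main thread can call start_logger(req, 'out.txt'), carry on with other work, and later call join(5) on the returned thread to wait at most five seconds rather than forever.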

Tags:

python

http

logging

urllib2

Corey Goldberg
