 

Python read website data line by line when available

Tags:

python

I am using urllib2 to read data from a URL. Below is the code snippet:

import urllib2

# urllink holds the address of the CGI script
data = urllib2.urlopen(urllink)
for lines in data.readlines():
    print lines

The URL I am opening is a CGI script that does some processing and prints data as it goes. The script takes around 30 minutes to complete, so with the above code I only see the output after 30 minutes, once the CGI script has finished.

How can I read the data from the URL and print it as soon as it is available?

asked Jun 01 '13 08:06 by sarbjit




1 Answer

Just loop directly over the file object:

for line in data:
    print line

This reads the incoming data stream line by line (internally, the socket fileobject calls .readline() every time you iterate). This does assume that your server is sending data as soon as possible.

Calling .readlines() (plural) reads the whole response before you start looping, so don't do that.
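The snippets in this answer are Python 2. On Python 3, urllib2 became urllib.request, and iterating over the response object works the same way. Here is a minimal sketch, assuming the CGI output is UTF-8 text; iter_decoded_lines and stream_url are hypothetical helper names, not part of any library:

```python
import io
import urllib.request


def iter_decoded_lines(response, encoding="utf-8"):
    """Yield each line of a binary file-like object as soon as it is read."""
    for raw_line in response:  # iteration reads one line at a time
        yield raw_line.decode(encoding, errors="replace").rstrip("\r\n")


def stream_url(url):
    """Print lines from url as they arrive (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        for line in iter_decoded_lines(response):
            print(line, flush=True)


# Offline demonstration: an in-memory stream stands in for the HTTP response.
demo_lines = list(iter_decoded_lines(io.BytesIO(b"step 1\nstep 2\n")))
```

Calling stream_url with the CGI script's address should then print each line as the server sends it, subject to the same caveat about the server not buffering its output.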

Alternatively, use the requests library, which has more explicit support for request streaming:

import requests

r = requests.get(url, stream=True)

for line in r.iter_lines():
    # iter_lines() can yield empty keep-alive lines; skip them
    if line:
        print line

Note that this will only work if your server starts streaming data immediately. If your CGI script doesn't produce any output until the process is complete, there is no point in trying to stream the data.
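On the server side, a common reason for output arriving all at once is that the script's standard output is buffered. The question doesn't say what language the CGI script is written in. Assuming it is Python, a rough sketch of flushing after every line could look like this; emit and run_job are hypothetical names, and the web server itself may still buffer the response:

```python
import io
import sys
import time


def emit(message, out=None):
    """Write one line and flush so it leaves the process right away."""
    out = sys.stdout if out is None else out
    out.write(message + "\n")
    out.flush()  # without this, output may sit in a buffer until exit


def run_job(steps, out=None, delay=0.0):
    """Hypothetical long-running job that reports after every step."""
    for step in steps:
        time.sleep(delay)  # stand-in for the real processing
        emit("finished %s" % step, out)


# Offline demonstration: capture the output in memory instead of stdout.
buffer = io.StringIO()
run_job(["parse", "process"], out=buffer)
captured = buffer.getvalue()
```

A real CGI response would also need to print its headers (for example a Content-Type line followed by a blank line) before the first call to emit.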

answered Oct 02 '22 20:10 by Martijn Pieters
