Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I create a GzipFile instance from the “file-like object” that urllib.urlopen() returns?

I’m playing around with the Stack Overflow API using Python. I’m trying to decode the gzipped responses that the API gives.

import urllib, gzip

url = urllib.urlopen('http://api.stackoverflow.com/1.0/badges/name')
gzip.GzipFile(fileobj=url).read()

According to the urllib2 documentation, urlopen “returns a file-like object”.

However, when I run read() on the GzipFile object I’ve created using it, I get this error:

AttributeError: addinfourl instance has no attribute 'tell'

As far as I can tell, this is coming from the object returned by urlopen.

It doesn’t appear to have seek either, as I get an error when I do this:

url.read()
url.seek(0)

What exactly is this object, and how do I create a functioning GzipFile instance from it?

like image 710
Paul D. Waite Avatar asked Nov 17 '10 13:11

Paul D. Waite


2 Answers

The urlopen docs list the supported methods of the object that is returned. I recommend wrapping the object in another class that supports the methods that gzip expects.

Other option: call the read method of the response object and put the result in a StringIO object (which should support all methods that gzip expects). This maybe a little more expensive though.

E.g.

import gzip
import json
import StringIO
import urllib

url = urllib.urlopen('http://api.stackoverflow.com/1.0/badges/name')
url_f = StringIO.StringIO(url.read())
g = gzip.GzipFile(fileobj=url_f)
j = json.load(g)
like image 74
stefanw Avatar answered Oct 19 '22 09:10

stefanw


import urllib2
import json
import gzip
import io

url='http://api.stackoverflow.com/1.0/badges/name'
page=urllib2.urlopen(url)
gzip_filehandle=gzip.GzipFile(fileobj=io.BytesIO(page.read()))
json_data=json.loads(gzip_filehandle.read())
print(json_data)

io.BytesIO is for Python2.6+. For older versions of Python, you could use cStringIO.StringIO.

like image 32
unutbu Avatar answered Oct 19 '22 09:10

unutbu