I currently use mechanize to read a gzipped web page, as below:
br = mechanize.Browser()
br.set_handle_gzip(True)
response = br.open(url)
data = response.read()
How can I decompress gzipped data fetched with urllib2 into HTML text?
req = urllib2.Request(url)
opener = urllib2.build_opener()
response = opener.open(req)
data = response.read()
if response.info().get('Content-Encoding') == 'gzip':
    # HOW TO DECOMPRESS DATA TO HTML?
Try this:
import gzip
import StringIO

# Wrap the raw response bytes in a file-like object so GzipFile can read them
buf = StringIO.StringIO(data)
gzipper = gzip.GzipFile(fileobj=buf)
html = gzipper.read()
html should now hold the decompressed HTML (print it to see). See here for more info.
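For anyone on Python 3, where urllib2 became urllib.request and StringIO.StringIO is replaced by io.BytesIO for binary data, the same idea can be sketched roughly as follows. The helper names decode_body and fetch_html are made up for illustration, and sending an Accept-Encoding header is an assumption about what the server needs before it will gzip the response.

```python
import gzip
import io
import urllib.request


def decode_body(data, content_encoding):
    """Return the body bytes, gunzipped if the header value says gzip.

    Hypothetical helper: `content_encoding` is the Content-Encoding
    header value, or None if the server did not send one.
    """
    if (content_encoding or "").strip().lower() == "gzip":
        # Same approach as above: wrap the bytes in a file-like object
        # and let GzipFile do the decompression.
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as gzipper:
            return gzipper.read()
    return data


def fetch_html(url):
    """Fetch `url`, asking for gzip, and return the decompressed bytes."""
    # Assumption: the server only compresses when asked via Accept-Encoding.
    req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req) as response:
        encoding = response.headers.get("Content-Encoding")
        return decode_body(response.read(), encoding)
```

The result is bytes, not text, so something like fetch_html(url).decode("utf-8") is still needed, assuming the page really is UTF-8.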