
Does python urllib2 automatically uncompress gzip data fetched from webpage?

I'm using

 data=urllib2.urlopen(url).read() 

I want to know:

  1. How can I tell if the data at a URL is gzipped?

  2. Does urllib2 automatically uncompress the data if it is gzipped? Will the data always be a string?

mlzboy asked Oct 16 '10 00:10


1 Answer

  1. How can I tell if the data at a URL is gzipped?

This checks if the content is gzipped and decompresses it:

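    import gzip
    import urllib2
    from StringIO import StringIO

    request = urllib2.Request('http://example.com/')
    # Tell the server that a gzip-compressed response is acceptable.
    request.add_header('Accept-encoding', 'gzip')
    response = urllib2.urlopen(request)
    if response.info().get('Content-Encoding') == 'gzip':
        # GzipFile needs a file-like object, so wrap the raw body first.
        buf = StringIO(response.read())
        f = gzip.GzipFile(fileobj=buf)
        data = f.read()
    else:
        # The server sent the body uncompressed.
        data = response.read()

  2. Does urllib2 automatically uncompress the data if it is gzipped? Will the data always be a string?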

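No. urllib2 does not uncompress the data automatically. It also does not ask for compressed content on its own: the 'Accept-Encoding' header that requests it is set by you, for example with request.add_header('Accept-Encoding', 'gzip, deflate'). Once you send that header, you have to check the response's 'Content-Encoding' header and decompress the body yourself. Either way, read() returns a byte string (str in Python 2).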
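If you send 'gzip, deflate' as above, the response may come back in either encoding, so the decompression step has to handle both. Here is a minimal sketch of that step, kept separate from the network call so it can be tried on its own. The function name decode_body is hypothetical (it is not part of urllib2), and the fallback for raw deflate streams assumes that some servers omit the zlib header, which happens in practice but is not guaranteed.

```python
import gzip
import io
import zlib


def decode_body(raw, content_encoding):
    """Decompress a response body according to its Content-Encoding value.

    decode_body is a hypothetical helper name used for illustration only.
    """
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        # GzipFile needs a file-like object, so wrap the raw bytes.
        return gzip.GzipFile(fileobj=io.BytesIO(raw)).read()
    if encoding == "deflate":
        try:
            # Standard form: a deflate stream wrapped in a zlib header.
            return zlib.decompress(raw)
        except zlib.error:
            # Assumed fallback: a raw deflate stream with no zlib header.
            return zlib.decompress(raw, -zlib.MAX_WBITS)
    # No encoding, or one this sketch does not handle: return bytes as-is.
    return raw
```

With the urllib2 code from this answer, it could be called as data = decode_body(response.read(), response.info().get('Content-Encoding')).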

ars answered Sep 27 '22 19:09