Is there a way to get the headers from a request created with Urllib2 or to confirm the HTTP headers sent with urllib2.urlopen?
This function always returns an object which can work as a context manager and has the properties url, headers, and status. See urllib.
urllib2 is deprecated in python 3. x. use urllib instaed.
Requests - Requests' is a simple, easy-to-use HTTP library written in Python. 1) Python Requests encodes the parameters automatically so you just pass them as simple arguments, unlike in the case of urllib, where you need to use the method urllib. encode() to encode the parameters before passing them.
The data returned by urlopen() or urlretrieve() is the raw data returned by the server. This may be binary data (such as an image), plain text or (for example) HTML. The HTTP protocol provides type information in the reply header, which can be inspected by looking at the Content-Type header.
An easy way to see request (and response headers) is to enable debug output:
opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
You then can see the precise headers sent/recieved:
>>> opener.open('http://python.org')
send: 'GET / HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: python.org\r\nConnection: close\r\nUser-Agent: Python-urllib/2.7\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Tue, 14 Jun 2011 08:23:35 GMT
header: Server: Apache/2.2.16 (Debian)
header: Last-Modified: Mon, 13 Jun 2011 19:41:35 GMT
header: ETag: "105800d-486d-4a59d1b6699c0"
header: Accept-Ranges: bytes
header: Content-Length: 18541
header: Connection: close
header: Content-Type: text/html
header: X-Pad: avoid browser bug
<addinfourl at 140175550177224 whose fp = <socket._fileobject object at 0x7f7d29c3d5d0>>
You can also set with the urllib2.Request
objects headers before making the request (and override the default headers, although won't be present in the headers dict beforehand):
>>> req = urllib2.Request(url='http://python.org')
>>> req.add_header('User-Agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0)')
>>> req.headers
{'User-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0)'}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With