
Get Request Headers for Urllib2.Request?

Tags: python, urllib2

Is there a way to get the headers from a request created with urllib2, or to confirm the HTTP headers actually sent by urllib2.urlopen?

asked Jun 14 '11 08:06 by wag2639



1 Answer

An easy way to see the request (and response) headers is to enable debug output:

import urllib2
opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))

You can then see the precise headers sent and received:

>>> opener.open('http://python.org')
send: 'GET / HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: python.org\r\nConnection: close\r\nUser-Agent: Python-urllib/2.7\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Tue, 14 Jun 2011 08:23:35 GMT
header: Server: Apache/2.2.16 (Debian)
header: Last-Modified: Mon, 13 Jun 2011 19:41:35 GMT
header: ETag: "105800d-486d-4a59d1b6699c0"
header: Accept-Ranges: bytes
header: Content-Length: 18541
header: Connection: close
header: Content-Type: text/html
header: X-Pad: avoid browser bug
<addinfourl at 140175550177224 whose fp = <socket._fileobject object at 0x7f7d29c3d5d0>>
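
On Python 3, where urllib2 was merged into urllib.request, roughly the same technique can be sketched as follows. This is only an illustration under a few assumptions: the helper name fetch_with_debug and the tiny local test server are made up for this example (so it runs without internet access), and an empty ProxyHandler is passed to keep any configured proxy out of the way. The debug lines are printed to stdout, so they are captured here into a string.

```python
import contextlib
import io
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class _OkHandler(BaseHTTPRequestHandler):
    """Hypothetical local server used only so the example is self-contained."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the server's own access log


def fetch_with_debug():
    """Return (debug_log, body) for one GET made with debuglevel=1."""
    server = HTTPServer(("127.0.0.1", 0), _OkHandler)  # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}),             # assumption: bypass proxies
            urllib.request.HTTPHandler(debuglevel=1),    # same idea as the urllib2 call
        )
        captured = io.StringIO()
        url = "http://127.0.0.1:%d/" % server.server_port
        # The "send:", "reply:" and "header:" lines go to stdout.
        with contextlib.redirect_stdout(captured):
            with opener.open(url, timeout=5) as resp:
                body = resp.read()
        return captured.getvalue(), body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    log, _ = fetch_with_debug()
    print(log)
```

Against a real site you would skip the local server and simply call opener.open() on the URL; for https URLs the handler to configure would be urllib.request.HTTPSHandler(debuglevel=1) instead.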

You can also set headers on the urllib2.Request object before making the request. This overrides the default headers, although the defaults themselves won't be present in the headers dict beforehand:

>>> req = urllib2.Request(url='http://python.org')
>>> req.add_header('User-Agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0)')
>>> req.headers
{'User-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0)'}
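
A rough Python 3 equivalent, assuming urllib.request in place of urllib2, shows the same behaviour without sending anything over the network. The variable names are only illustrative. Note that add_header() stores the name capitalized as 'User-agent', and that defaults such as Host are not filled in until the request is actually opened.

```python
import urllib.request

req = urllib.request.Request(url="http://python.org")
req.add_header("User-Agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0)")

# Headers set explicitly so far, as a plain dict.
explicit_headers = dict(req.header_items())

# Lookups use the stored (capitalized) spelling of the header name.
has_user_agent = req.has_header("User-agent")
user_agent = req.get_header("User-agent")

# Default headers are added later, while the request is being opened,
# so they are not visible on the Request object yet.
host_before_sending = req.get_header("Host")

print(explicit_headers)
print(has_user_agent, host_before_sending)
```

To confirm what is finally sent, including the defaults, the debug output described above remains the more reliable check.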

answered Oct 04 '22 06:10 by zeekay