How do you get default headers in a urllib2 Request?

Tags: python, urllib2

I have a Python web client that uses urllib2. It is easy enough to add HTTP headers to my outgoing requests. I just create a dictionary of the headers I want to add, and pass it to the Request initializer.
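
For illustration, a minimal sketch of that setup, written against Python 3's `urllib.request` (the successor to `urllib2`). The URL, header names, and values are hypothetical placeholders, not taken from my real client. Note that urllib normalizes header names with `str.capitalize()`, and that at this point the Request object holds only the headers set explicitly:

```python
import urllib.request

# Hypothetical custom headers; the names and values are placeholders.
custom_headers = {
    "X-Client-Id": "demo-client",
    "Accept": "application/json",
}

# Building the Request does not touch the network.
request = urllib.request.Request(
    "http://www.example.com/", headers=custom_headers
)

# header_items() lists what has been set on the Request object so far.
# Names are stored capitalize()d, e.g. "X-Client-Id" becomes "X-client-id".
explicit_headers = dict(request.header_items())
```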

However, other "standard" HTTP headers get added to the request alongside the custom ones I explicitly add. When I sniff the request using Wireshark, I see headers besides the ones I added myself. My question is: how do I get access to these headers? I want to log every request (including the full set of HTTP headers), and I can't figure out how.

Any pointers?

In a nutshell: how do I get all the outgoing headers from an HTTP request created by urllib2?

Corey Goldberg asked Mar 02 '09 20:03


1 Answer

If you want to see the literal HTTP request that is sent out, and therefore see every last header exactly as it is represented on the wire, then you can tell urllib2 to use your own version of an HTTPHandler that prints out (or saves, or whatever) the outgoing HTTP request.

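import httplib, urllib2

class MyHTTPConnection(httplib.HTTPConnection):
    def send(self, s):
        # s is the raw request text about to be written to the socket
        print s  # or save them, or whatever!
        httplib.HTTPConnection.send(self, s)

class MyHTTPHandler(urllib2.HTTPHandler):
    def http_open(self, req):
        # have urllib2 open HTTP URLs through the logging connection class
        return self.do_open(MyHTTPConnection, req)

# build_opener() uses MyHTTPHandler in place of the default HTTPHandler
opener = urllib2.build_opener(MyHTTPHandler)
response = opener.open('http://www.google.com/')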

The result of running this code is:

GET / HTTP/1.1
Accept-Encoding: identity
Host: www.google.com
Connection: close
User-Agent: Python-urllib/2.6
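
On Python 3, where `urllib2` and `httplib` became `urllib.request` and `http.client`, the same idea might look roughly like the sketch below. This is an adaptation under assumptions, not a tested drop-in: the names `captured_requests`, `LoggingHTTPConnection`, `LoggingHTTPHandler`, and `open_with_logging` are made up for illustration, and it collects the outgoing bytes in a list instead of printing them.

```python
import http.client
import urllib.request

# Raw request chunks, in the order they were handed to the socket.
captured_requests = []


class LoggingHTTPConnection(http.client.HTTPConnection):
    """An HTTPConnection that records outgoing bytes before sending them."""

    def send(self, data):
        # A request body may be passed as a file-like object or iterable,
        # so only plain bytes (such as the header block) are recorded here.
        if isinstance(data, (bytes, bytearray)):
            captured_requests.append(bytes(data))
        super().send(data)


class LoggingHTTPHandler(urllib.request.HTTPHandler):
    """Route plain-HTTP requests through LoggingHTTPConnection."""

    def http_open(self, req):
        return self.do_open(LoggingHTTPConnection, req)


def open_with_logging(url, headers=None):
    """Open url through the logging handler and return the response."""
    opener = urllib.request.build_opener(LoggingHTTPHandler)
    request = urllib.request.Request(url, headers=headers or {})
    return opener.open(request)
```

After a call such as `open_with_logging('http://www.example.com/')`, the first entry of `captured_requests` should hold the request line and headers as raw bytes. This sketch only covers plain `http://` URLs; `https://` requests go through a separate handler and would need the same treatment applied to `HTTPSHandler` and `HTTPSConnection`.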
Brandon Rhodes answered Sep 19 '22 10:09