I'm trying to do a HEAD request of a page using Python 2.
I am trying
import misc_urllib2 ..... opender = urllib2.build_opener([misc_urllib2.MyHTTPRedirectHandler(), misc_urllib2.HeadRequest()])
with misc_urllib2.py
containing
class HeadRequest(urllib2.Request): def get_method(self): return "HEAD" class MyHTTPRedirectHandler(urllib2.HTTPRedirectHandler): def __init__ (self): self.redirects = [] def http_error_301(self, req, fp, code, msg, headers): result = urllib2.HTTPRedirectHandler.http_error_301( self, req, fp, code, msg, headers) result.redirect_code = code return result http_error_302 = http_error_303 = http_error_307 = http_error_301
But I am getting
TypeError: __init__() takes at least 2 arguments (1 given)
If I just do
opender = urllib2.build_opener(misc_urllib2.MyHTTPRedirectHandler())
then it works fine
urllib2 offers a very simple interface, in the form of the urlopen function. Just pass the URL to urlopen() to get a “file-like” handle to the remote data. like basic authentication, cookies, proxies and so on. These are provided by objects called handlers and openers.
urllib2 is deprecated in python 3. x. use urllib instaed.
1) urllib2 can accept a Request object to set the headers for a URL request, urllib accepts only a URL. 2) urllib provides the urlencode method which is used for the generation of GET query strings, urllib2 doesn't have such a function. This is one of the reasons why urllib is often used along with urllib2.
Urllib package is the URL handling module for python. It is used to fetch URLs (Uniform Resource Locators). It uses the urlopen function and is able to fetch URLs using a variety of different protocols. Urllib is a package that collects several modules for working with URLs, such as: urllib.
This works just fine:
import urllib2 request = urllib2.Request('http://localhost:8080') request.get_method = lambda : 'HEAD' response = urllib2.urlopen(request) print response.info()
Tested with quick and dirty HTTPd hacked in python:
Server: BaseHTTP/0.3 Python/2.6.6 Date: Sun, 12 Dec 2010 11:52:33 GMT Content-type: text/html X-REQUEST_METHOD: HEAD
I've added a custom header field X-REQUEST_METHOD to show it works :)
Here is HTTPd log:
Sun Dec 12 12:52:28 2010 Server Starts - localhost:8080 localhost.localdomain - - [12/Dec/2010 12:52:33] "HEAD / HTTP/1.1" 200 -
Edit: there is also httplib2
import httplib2 h = httplib2.Http() resp = h.request("http://www.google.com", 'HEAD')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With