I'm trying to learn how urllib2 works and how it packages up its various components before an actual request is sent and a response comes back.
So far I have:
theurl = "http://www.example.com"
That obviously specifies the URL to look at.
req = urllib2.Request(theurl)
Don't know what this does, hence the question.
handle = urllib2.urlopen(req)
This one gets the page and does all the requests and responses required.
So my question is, what does urllib2.Request actually do?
To get an idea of what it holds, I tried
print req
and just got
<urllib2.Request instance at 0x123456789>
I also tried
print req.read()
and got:
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib64/python2.4/urllib2.py", line 207, in __getattr__
    raise AttributeError, attr
AttributeError: read
So I'm obviously doing something wrong. If anyone can help with either or both of my questions, that would be great.
The class "Request" you're asking about: http://docs.python.org/library/urllib2.html#urllib2.Request
class urllib2.Request(url[, data][, headers][, origin_req_host][, unverifiable])
This class is an abstraction of a URL request.
The function that actually makes the request is urlopen. It accepts either a Request object or a URL string you provide, in which case it builds the Request object for you: http://docs.python.org/library/urllib2.html#urllib2.urlopen
urllib2.urlopen(url[, data][, timeout])
Open the URL url, which can be either a string or a Request object.
Example:
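import urllib2

theurl = "http://www.example.com"  # urlopen needs the scheme (http://)
try:
    resp = urllib2.urlopen(theurl)
    print resp.read()
except IOError as e:  # on Python 2.5 or older: except IOError, e:
    print "Error: ", e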
Example 2 (with Request):
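import urllib2

theurl = "http://www.example.com"  # urlopen needs the scheme (http://)
try:
    req = urllib2.Request(theurl)  # only builds the request, nothing is sent yet
    print req.get_full_url()
    print req.get_method()
    print dir(req)  # list lots of other stuff in Request
    resp = urllib2.urlopen(req)  # this is the call that contacts the server
    print resp.read()  # read() lives on the response, not on the Request
except IOError as e:  # on Python 2.5 or older: except IOError, e:
    print "Error: ", e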
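If it helps to see what the optional data and headers arguments from the signature above change, here is a minimal sketch that only builds Request objects and never contacts a server. The /search path, the form data and the "demo-client" header value are made up for illustration, and the fallback import is my own addition so the same lines also run on Python 3.

```python
# Sketch only: written for Python 2 (urllib2). The fallback import is an
# assumption added so the same lines also run on Python 3.
try:
    import urllib2
except ImportError:
    import urllib.request as urllib2

# Hypothetical URL and form data, chosen purely for illustration.
search_url = "http://www.example.com/search"
form_data = "q=python".encode("ascii")  # bytes on Python 3, plain str on Python 2

# No data supplied: the request would be sent as a GET.
plain = urllib2.Request(search_url)

# Data and a custom header supplied: the request would be sent as a POST.
post = urllib2.Request(search_url, form_data, {"User-Agent": "demo-client"})

print(plain.get_method())
print(post.get_method())
print(post.get_full_url())

# Request capitalises header names when storing them,
# so "User-Agent" is kept as "User-agent".
print(post.header_items())
```

Nothing goes over the wire in this sketch. You would still have to pass plain or post to urllib2.urlopen() to get a response back.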
urllib2.Request() looks like a function call, but isn't: it's an object constructor. It creates an object of type Request from the urllib2 module, which is documented in the urllib2 library reference.
As such, it doesn't send anything over the network; it only initialises itself with the URL, data and headers you pass in. You can verify this by looking at the source code, which should be in your Python installation's lib directory (urllib2.py, at least in Python 2.x).
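A quick way to convince yourself without reading the source is the rough sketch below. It assumes a made-up host under the reserved .invalid domain, which never resolves, and the fallback import is again only there so the lines also run on Python 3.

```python
# Sketch only: written for Python 2 (urllib2). The fallback import is an
# assumption added so the same lines also run on Python 3.
try:
    import urllib2
except ImportError:
    import urllib.request as urllib2

# "host.invalid" is a hypothetical name under the reserved .invalid domain.
# Building the Request still succeeds, because no connection is attempted.
unreachable = urllib2.Request("http://host.invalid/")
print(unreachable.get_full_url())

# A Request has no read() method, which is why req.read() raised
# AttributeError in the question; read() belongs to the response object.
print(hasattr(unreachable, "read"))

# The network is only touched by urlopen(), so that is where a failure
# for this host would show up:
# resp = urllib2.urlopen(unreachable)
# print(resp.read())
```

The constructor succeeds even for a host that cannot exist, so any network error you see is coming from urlopen(), not from Request.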