
What does urllib2.Request(<url>) do, and how do I print/view it?

Tags:

python

I'm trying to learn how urllib2 works and how it encapsulates its various components before it sends out an actual request and handles the response.

So far I have:

theurl = "www.example.com"

That obviously specifies the URL to look at.

req = urllib2.Request(theurl) 

Don't know what this does, hence the question.

handle = urllib2.urlopen(req)

This one gets the page and does all the requests and responses required.

So my question is, what does urllib2.Request actually do?

To try and look at it to get an idea I tried

print req 

and just got

<urllib2.Request instance at 0x123456789>

I also tried

print req.read() 

and got:

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib64/python2.4/urllib2.py", line 207, in __getattr__
    raise AttributeError, attr
AttributeError: read

So I'm obviously doing something wrong. If anyone can help with either or both of my questions, that would be great.

user788462 asked Jun 23 '11 01:06


2 Answers

The class "Request" you're asking about: http://docs.python.org/library/urllib2.html#urllib2.Request

class urllib2.Request(url[, data][, headers][, origin_req_host][, unverifiable])

This class is an abstraction of a URL request.
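To make "abstraction of a URL request" a bit more concrete, here is a minimal sketch of what the object holds before anything is sent. It assumes Python 3, where urllib2's Request lives in urllib.request (on Python 2, import urllib2 instead and read the URL with get_full_url()). The URL is only a placeholder.

```python
# Python 3 spelling; on Python 2 the class is urllib2.Request.
from urllib.request import Request

# Building the object only stores and parses the URL; nothing is sent yet.
req = Request("http://www.example.com/")

print(req.full_url)           # the URL that was passed in
print(req.host)               # the host parsed out of that URL
print(req.get_method())       # GET, because no data was supplied
print(hasattr(req, "read"))   # False: read() belongs to the response, not the request
```

That last line is probably why print req.read() failed in the question: read() is a method of the object that urlopen returns, not of the Request.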

The function you actually want for making a request is urlopen. It accepts either a Request object or a URL string you provide, in which case it constructs the Request object for you: http://docs.python.org/library/urllib2.html#urllib2.urlopen

urllib2.urlopen(url[, data][, timeout])

Open the URL url, which can be either a string or a Request object.
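The "string or Request object" part can be checked without a network connection. This is only a sketch under two assumptions: it uses Python 3's urllib.request (the renamed urllib2), and it uses a data: URL, which carries its payload inline, as a stand-in for a real web address.

```python
from urllib.request import Request, urlopen

# A data: URL embeds its content, so nothing is fetched from a remote host.
url = "data:text/plain,hello"

# Form 1: pass a plain string; urlopen wraps it in a Request internally.
with urlopen(url) as resp:
    from_string = resp.read()

# Form 2: build the Request yourself and hand it to urlopen.
with urlopen(Request(url)) as resp:
    from_request = resp.read()

print(from_string, from_request)
```

Both forms should return the same bytes, so building the Request yourself is mainly useful when you want to inspect or adjust it first.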

Example:

import urllib2

theurl = "http://www.example.com"  # the scheme is required; a bare hostname raises ValueError
try:
    resp = urllib2.urlopen(theurl)
    print resp.read()
except IOError as e:
    print "Error: ", e

Example 2 (with Request):

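import urllib2

theurl = "http://www.example.com"  # the scheme is required; a bare hostname raises ValueError
try:
    req = urllib2.Request(theurl)
    print req.get_full_url()   # the URL the request will go to
    print req.get_method()     # "GET" unless data was supplied
    print dir(req)  # list lots of other stuff in Request
    resp = urllib2.urlopen(req)  # the network traffic happens here
    print resp.read()
except IOError as e:
    print "Error: ", e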
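
Beyond get_full_url() and get_method(), a few other things are worth printing when you want to "view" a request. The sketch below assumes Python 3's urllib.request rather than urllib2, and the URL, body and header value are all made up for illustration.

```python
from urllib.request import Request

req = Request(
    "http://www.example.com/form",          # placeholder URL
    data=b"name=value",                      # supplying a body switches the method to POST
    headers={"User-Agent": "demo-client"},   # hypothetical header value
)

print(req.get_full_url())             # http://www.example.com/form
print(req.get_method())               # POST
print(req.data)                       # b'name=value'
print(req.get_header("User-agent"))   # header names are stored capitalized
print(req.header_items())             # all headers as (name, value) pairs
```

None of these calls contacts the server; they only read back what the Request object is holding.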

dkamins answered Sep 20 '22 08:09


urllib2.Request() looks like a function call, but it isn't: it's an object constructor. It creates an object of type Request from the urllib2 module, which is documented in the urllib2 library reference.

As such, it probably doesn't do anything except initialise itself. You can verify this by looking at the source code, which should be in your Python installation's lib directory (urllib2.py, at least in Python 2.x).
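If you would rather not hunt through the lib directory by hand, the inspect module can locate and print the source for you. This is a rough sketch that assumes Python 3, where the module is urllib.request rather than urllib2, and assumes the standard library's .py files are present in your installation. The host name is deliberately one that cannot resolve.

```python
import inspect
import urllib.request

# Where this installation keeps the module's source file.
print(inspect.getsourcefile(urllib.request))

# The constructor body, so you can read what it does for yourself.
init_source = inspect.getsource(urllib.request.Request.__init__)
print(init_source)

# Constructing a Request for a host that cannot exist raises nothing,
# which is consistent with no connection being attempted at this point.
req = urllib.request.Request("http://host.invalid/")
print(req.host)
```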

MatthewD answered Sep 19 '22 08:09