
How can I get the final redirect URL when using urllib2.urlopen?

Tags: python, urllib2

I'm using urllib2.urlopen to open a URL and fetch the markup of a webpage. Some of these sites redirect me with a 301 or 302 response. How can I find out the final URL that I was redirected to?

Mridang Agarwalla asked Aug 24 '10 12:08



3 Answers

Call the .geturl() method on the file-like object that urllib2.urlopen returns. Per the urllib2 docs:

geturl() — return the URL of the resource retrieved, commonly used to determine if a redirect was followed

Example:

import urllib2
response = urllib2.urlopen('http://tinyurl.com/5b2su2')
response.geturl() # 'http://stackoverflow.com/'
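
For readers on Python 3, where urllib2's functionality lives in urllib.request, the same idea can be sketched as below. This is an illustration under stated assumptions, not part of the original answer: the local test server, the /old and /new paths, and the helper names final_url and demo are all hypothetical, chosen only so the snippet runs without network access.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


class RedirectingHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /old answers 302 to /new, anything else 200."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            # Absolute Location built from the Host header the client sent.
            self.send_header("Location", "http://%s/new" % self.headers["Host"])
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"final page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def final_url(url):
    """Return the URL urlopen ended up at after following any redirects."""
    with urlopen(url) as response:
        return response.geturl()


def demo():
    """Start the test server on a free port and resolve /old through it."""
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        return base, final_url(base + "/old")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    base, final = demo()
    print("requested:", base + "/old")
    print("final:    ", final)
```

Against a real site, only final_url is needed; the server exists purely to make the redirect reproducible.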

mmmmmm answered Sep 30 '22 00:09


The return value of urllib2.urlopen has a geturl() method, which returns the actual URL of the resource, i.e. the URL reached after the last redirect.
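
To see why geturl() reports only the last hop, it can help to record the intermediate redirects as well. The sketch below is one possible way to do that in Python 3 by subclassing urllib.request.HTTPRedirectHandler; it is not something the answer above proposes. The class names, the /a, /b, /c paths, and the local test server are hypothetical and exist only to make the example self-contained.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import HTTPRedirectHandler, build_opener


class RecordingRedirectHandler(HTTPRedirectHandler):
    """Follows redirects as usual, but remembers each (status, target) hop."""

    def __init__(self):
        super().__init__()
        self.hops = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.hops.append((code, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


class ChainHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /a -> 301 /b -> 302 /c -> 200."""

    REDIRECTS = {"/a": (301, "/b"), "/b": (302, "/c")}

    def do_GET(self):
        if self.path in self.REDIRECTS:
            code, target = self.REDIRECTS[self.path]
            self.send_response(code)
            self.send_header(
                "Location", "http://%s%s" % (self.headers["Host"], target)
            )
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def demo():
    """Fetch /a from the test server; return (base, final URL, recorded hops)."""
    server = HTTPServer(("127.0.0.1", 0), ChainHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        recorder = RecordingRedirectHandler()
        # Passing a subclass instance replaces the default redirect handler.
        opener = build_opener(recorder)
        with opener.open(base + "/a") as response:
            return base, response.geturl(), recorder.hops
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    base, final, hops = demo()
    print("final:", final)
    for code, url in hops:
        print(code, "->", url)
```

Here geturl() matches the target of the last recorded hop, while the earlier hops are visible only through the recording handler.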

Michael answered Sep 29 '22 22:09


For example:

urllib2.urlopen('ORIGINAL LINK').geturl()

or, passing an explicit Request object:

urllib2.urlopen(urllib2.Request('ORIGINAL LINK')).geturl()

kevin answered Sep 29 '22 22:09