I'm using the urllib2.urlopen method to open a URL and fetch the markup of a webpage. Some of these sites redirect me with 301/302 redirects. I would like to know the final URL that I've been redirected to. How can I get this?
The data returned by urlopen() or urlretrieve() is the raw data returned by the server. This may be binary data (such as an image), plain text or (for example) HTML. The HTTP protocol provides type information in the reply header, which can be inspected by looking at the Content-Type header.
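To get the redirection URL with Python's urllib library, import the urllib.request module and define a web page URL that you expect to be redirected when you send a request to it. Send the request and get the response object, then read the status code the web server returned: a 301 means the URL has been redirected permanently. Note that urlopen follows redirects by default, so the status code on the response is that of the final page unless redirect handling is disabled.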
urllib.request is a Python module for fetching URLs (Uniform Resource Locators). It offers a very simple interface, in the form of the urlopen function, which can fetch URLs using a variety of different protocols.
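To make the status-code check above concrete, here is a minimal Python 3 sketch. It assumes a throwaway local server (the hypothetical DemoHandler, with made-up /old and /new paths) so it runs without network access, and it disables redirect following with a small HTTPRedirectHandler subclass so the 301 and its Location header become visible. The names NoRedirect, first_hop and start_demo_server are illustrative, not part of urllib.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server: /old answers 301 with Location: /new."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(301)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"final page"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def start_demo_server():
    """Serve DemoHandler on a free localhost port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, "http://127.0.0.1:%d" % server.server_address[1]


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so the 3xx response surfaces as HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def first_hop(url):
    """Return (status code, Location header) of the first response only."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url) as response:
            return response.status, None  # no redirect happened
    except urllib.error.HTTPError as err:
        code, location = err.code, err.headers.get("Location")
        err.close()
        return code, location


if __name__ == "__main__":
    server, base = start_demo_server()
    print(first_hop(base + "/old"))
    server.shutdown()
    server.server_close()
```

With the default opener the same request would be followed automatically and report the status of the final page, which is why the sketch swaps in NoRedirect.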
Call the .geturl() method of the file object returned. Per the urllib2 docs:
geturl(): return the URL of the resource retrieved, commonly used to determine if a redirect was followed
Example:
import urllib2
response = urllib2.urlopen('http://tinyurl.com/5b2su2')
response.geturl() # 'http://stackoverflow.com/'
The return value of urllib2.urlopen has a geturl() method, which should return the actual (i.e. last redirect) URL.
e.g.:
urllib2.urlopen('ORIGINAL LINK').geturl()
urllib2.urlopen(urllib2.Request('ORIGINAL LINK')).geturl()
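urllib2 exists only in Python 2; in Python 3 the same function lives in urllib.request. As a rough sketch of the equivalent call, the example below redirects /old to /new on a throwaway local server (the hypothetical DemoHandler, so no real site is contacted) and reads the final URL with geturl(). The helper names resolve_final_url and start_demo_server are made up for illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server: /old answers 302 with Location: /new."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"final page"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def start_demo_server():
    """Serve DemoHandler on a free localhost port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, "http://127.0.0.1:%d" % server.server_address[1]


def resolve_final_url(url):
    """Open url, let urlopen follow any redirects, and return where it ended up."""
    with urllib.request.urlopen(url) as response:
        return response.geturl()


if __name__ == "__main__":
    server, base = start_demo_server()
    print(resolve_final_url(base + "/old"))
    server.shutdown()
    server.server_close()
```

On Python 3.9 and later, geturl() is documented as deprecated in favour of the response's url attribute, which holds the same value.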