Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python Requests library redirect new url

I've been looking through the Python Requests documentation but I cannot see any functionality for what I am trying to achieve.

In my script I am setting allow_redirects=True.

I would like to know if the page has been redirected to something else, what is the new URL.

For example, if the start URL was: www.google.com/redirect

And the final URL is www.google.co.uk/redirected

How do I get that URL?

like image 368
Daniel Pilch Avatar asked Dec 09 '13 16:12

Daniel Pilch


People also ask

How do you get a redirected URL in Python?

Use Python urllib Library To Get Redirection URL. request module. Define a web page URL, suppose this URL will be redirected when you send a request to it. Get the response object. Get the webserver returned response status code, if the code is 301 then it means the URL has been redirected permanently.

Does fetch follow redirects?

Normally, fetch transparently follows HTTP-redirects, like 301, 302 etc.

How do I create a redirect URL?

Redirects allow you to forward the visitors of a specific URL to another page of your website. In Site Tools, you can add redirects by going to Domain > Redirects. Choose the desired domain, fill in the URL you want to redirect to another and add the URL of the new page destination. When ready, click Create.


5 Answers

You are looking for the request history.

The response.history attribute is a list of responses that led to the final URL, which can be found in response.url.

response = requests.get(someurl)
if response.history:
    print("Request was redirected")
    for resp in response.history:
        print(resp.status_code, resp.url)
    print("Final destination:")
    print(response.status_code, response.url)
else:
    print("Request was not redirected")

Demo:

>>> import requests
>>> response = requests.get('http://httpbin.org/redirect/3')
>>> response.history
(<Response [302]>, <Response [302]>, <Response [302]>)
>>> for resp in response.history:
...     print(resp.status_code, resp.url)
... 
302 http://httpbin.org/redirect/3
302 http://httpbin.org/redirect/2
302 http://httpbin.org/redirect/1
>>> print(response.status_code, response.url)
200 http://httpbin.org/get
like image 194
Martijn Pieters Avatar answered Oct 19 '22 00:10

Martijn Pieters


This is answering a slightly different question, but since I got stuck on this myself, I hope it might be useful for someone else.

If you want to use allow_redirects=False and get directly to the first redirect object, rather than following a chain of them, and you just want to get the redirect location directly out of the 302 response object, then r.url won't work. Instead, it's the "Location" header:

r = requests.get('http://github.com/', allow_redirects=False)
r.status_code  # 302
r.url  # http://github.com, not https.
r.headers['Location']  # https://github.com/ -- the redirect destination
like image 39
hwjp Avatar answered Oct 19 '22 01:10

hwjp


the documentation has this blurb https://requests.readthedocs.io/en/master/user/quickstart/#redirection-and-history

import requests

r = requests.get('http://www.github.com')
r.url
#returns https://www.github.com instead of the http page you asked for 
like image 47
Back2Basics Avatar answered Oct 19 '22 00:10

Back2Basics


I think requests.head instead of requests.get will be more safe to call when handling url redirect. Check a GitHub issue here:

r = requests.head(url, allow_redirects=True)
print(r.url)
like image 44
Geng Jiawen Avatar answered Oct 18 '22 23:10

Geng Jiawen


For python3.5, you can use the following code:

import urllib.request
res = urllib.request.urlopen(starturl)
finalurl = res.geturl()
print(finalurl)
like image 13
Shuai.Z Avatar answered Oct 19 '22 00:10

Shuai.Z