I made a little parser using HTMLparser and I would like to know where a link is redirected. I don't know how to explain this, so please look this example: On my page I have a link on the source: <code>http://www.myweb.com?out=147</code>, which redirects to <code>http://www.mylink.com</code>. I can parse <code>http://www.myweb.com?out=147</code> without any problems, but I don't know how to get <code>http://www.mylink.com</code>.

You can use <code>urllib2</code> (<code>urllib.request</code> in Python 3) and its <code>HTTPRedirectHandler</code> in order to find out where a URL will redirect you. Here's a function that does that: <pre class="prettyprint"><code>import urllib2 def get_redirected_url(url): opener = urllib2.build_opener(urllib2.HTTPRedirectHandler) request = opener.open(url) return request.url print get_redirected_url("http://google.com/") # prints "http://www.google.com/" </code></pre>

Determining redirected URL in Python

Tags:

python

redirect

parsing

I made a little parser using HTMLparser and I would like to know where a link is redirected. I don't know how to explain this, so please look this example:

On my page I have a link on the source: http://www.myweb.com?out=147, which redirects to http://www.mylink.com. I can parse http://www.myweb.com?out=147 without any problems, but I don't know how to get http://www.mylink.com.

470

asked Apr 04 '11 12:04

MisterU

1 Answers

You can use urllib2 (urllib.request in Python 3) and its HTTPRedirectHandler in order to find out where a URL will redirect you. Here's a function that does that:

import urllib2

def get_redirected_url(url):
    opener = urllib2.build_opener(urllib2.HTTPRedirectHandler)
    request = opener.open(url)
    return request.url

print get_redirected_url("http://google.com/")
# prints "http://www.google.com/"

200

answered Sep 19 '22 22:09

Jeremy

Related questions
                            
                                Making Matplotlib run faster
                            
                                pyPdf unable to extract text from some pages in my PDF
                            
                                How to override Python list(iterator) behaviour?
                            
                                Fastest Way to Round to the Nearest 5/100ths
                            
                                How to dump a boolean matrix in numpy?
                            
                                Python regex convert youtube url to youtube video
                            
                                Remove all articles, connector words, etc., from a string in Python
                            
                                What is faster for loop using enumerate or for loop using xrange in Python?
                            
                                Automatic image scaling on resize with (Py)GTK
                            
                                Iterate through parts of a string
                            
                                Best approach to use jira programmatically
                            
                                Python + nose: make assertions about logged text?
                            
                                How can I let android emulator talk to the localhost?
                            
                                Store a lot of data inside python
                            
                                Algorithm to determine the winner of a Texas Hold'em Hand
                            
                                How to draw probabilistic distributions with numpy/matplotlib?
                            
                                CSV Module's writer won't let me write binary out
                            
                                Getting command line arguments as tuples in python
                            
                                How do I url encode in Python?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With