How can I un-shorten a URL using python?

I have seen this thread already - How can I unshorten a URL?

My issue with the accepted answer (which uses the unshort.me API) is that I am focusing on unshortening YouTube links. Because unshort.me is so heavily used, almost 90% of my requests come back with captchas, which I am unable to solve.

So far I am stuck with using:

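import urllib2

def unshorten_url(url):
    # urlopen follows the redirects, but it sends a GET and starts fetching the final page
    resolvedURL = urllib2.urlopen(url)
    print resolvedURL.url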

    #t = Test()
    #c = pycurl.Curl()
    #c.setopt(c.URL, 'http://api.unshort.me/?r=%s&t=xml' % (url))
    #c.setopt(c.WRITEFUNCTION, t.body_callback)
    #c.perform()
    #c.close()
    #dom = xml.dom.minidom.parseString(t.contents)
    #resolvedURL = dom.getElementsByTagName("resolvedURL")[0].firstChild.nodeValue
    return resolvedURL.url

Note: everything in the comments is what I tried to do when using the unshort.me service which was returning captcha links.

Does anyone know of a more efficient way to complete this operation without using urlopen (since downloading the target page just to learn its URL is a waste of bandwidth)?

brandonmat asked Aug 22 '11 20:08


1 Answer

A one-line function using the requests library. And yes, it follows chains of redirects, not just the first hop.

import requests

def unshorten_url(url):
    # HEAD asks for headers only, so no page body is downloaded
    return requests.head(url, allow_redirects=True).url
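If installing requests is not an option, the same idea can be sketched with the standard library alone: send a HEAD request, read the Location header, and repeat until the response is no longer a redirect. This is only a rough sketch assuming Python 3; the max_hops and timeout parameters and the REDIRECT_CODES name are my own additions for illustration, and a server that refuses HEAD requests will simply end the chain early.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects in this sketch
REDIRECT_CODES = {301, 302, 303, 307, 308}

def unshorten_url(url, max_hops=10, timeout=10):
    """Follow redirects with HEAD requests and return the final URL."""
    for _ in range(max_hops):
        parts = urlsplit(url)
        if parts.scheme == "https":
            conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
        else:
            conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
        try:
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            # HEAD returns the status line and headers only, never a body
            conn.request("HEAD", path)
            response = conn.getresponse()
            status = response.status
            location = response.getheader("Location")
        finally:
            conn.close()
        if status not in REDIRECT_CODES or not location:
            return url
        # Location may be relative, so resolve it against the current URL
        url = urljoin(url, location)
    raise RuntimeError("too many redirects")
```

The hop limit is there so that a redirect loop raises an error instead of spinning forever.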
bersam answered Sep 18 '22 14:09
