
How can I prepend http to a url if it doesn't begin with http?

I have urls formatted as:

google.com
www.google.com
http://google.com
http://www.google.com

I would like to convert all of these links to a uniform format that starts with http://, for example:

http://google.com

How can I prepend URLs with http:// using Python?

PrivateUser asked Feb 09 '14 12:02


1 Answer

Python has built-in functions that handle this correctly, for example:

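import urlparse  # Python 2 module name

my_url = 'google.com'  # example input (assumed for illustration)

# Parse with 'http' as the default scheme when none is given.
p = urlparse.urlparse(my_url, 'http')
# Without a scheme, the host lands in .path rather than .netloc.
netloc = p.netloc or p.path
path = p.path if p.netloc else ''
# Optional: normalize the host so it always carries the www. prefix.
if not netloc.startswith('www.'):
    netloc = 'www.' + netloc

p = urlparse.ParseResult('http', netloc, path, *p[3:])
print(p.geturl())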

If you want to remove (or add) the www part, you have to edit the .netloc field of the resulting object before calling .geturl().

Because ParseResult is a namedtuple, you cannot edit it in place; you have to create a new object instead.
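
As a rough sketch of that point: namedtuples provide a `_replace` method that returns a modified copy, which is one way to build the new object. The helper name `strip_www` is hypothetical, and the import assumes Python 3.

```python
from urllib.parse import urlparse

def strip_www(url):
    """Return url with a leading 'www.' removed from its host (hypothetical helper)."""
    p = urlparse(url)
    netloc = p.netloc
    if netloc.startswith('www.'):
        netloc = netloc[len('www.'):]
    # ParseResult is immutable, so _replace builds a new object
    # with the changed field instead of editing p in place.
    return p._replace(netloc=netloc).geturl()

print(strip_www('http://www.google.com/search?q=x'))
```

This only covers URLs that already carry a scheme; for scheme-less input the host ends up in `.path`, as in the snippet above.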

PS: In Python 3, use urllib.parse.urlparse instead.
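
A possible Python 3 variant of the same idea, offered as a sketch rather than a definitive implementation. The function name `ensure_http` is made up for illustration. Unlike the snippet above, it leaves the www prefix untouched and keeps any scheme that is already present. Inputs with a port but no scheme (such as `google.com:80`) are not covered.

```python
from urllib.parse import urlparse

def ensure_http(url):
    """Prepend http:// when url has no scheme (hypothetical helper)."""
    p = urlparse(url, 'http')  # 'http' is used only if no scheme is found
    if p.netloc:
        # A scheme and host were already present; return the URL as parsed.
        return p.geturl()
    # Without a scheme, urlparse leaves "google.com/a" entirely in .path,
    # so split the host off the front and move it into .netloc.
    host, slash, rest = p.path.partition('/')
    return p._replace(netloc=host, path=slash + rest).geturl()

for u in ['google.com', 'www.google.com',
          'http://google.com', 'http://www.google.com']:
    print(ensure_http(u))
```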

JBernardo answered Oct 21 '22 13:10