
Python: split a domain name into name and extension

Tags: python, string

How would you split a domain name so that you get back its name and its extension?

bobsr asked May 26 '10 08:05


2 Answers

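Wow, there are a lot of bad answers here. You can only do this reliably if you know what's on the public suffix list. If you're using split, a regex, or anything else that doesn't consult that list, you're doing it wrong: a suffix such as co.uk has more than one label, so splitting on the last dot gives the wrong answer.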

Luckily, this is Python, and there's a library for this, tldextract: https://pypi.python.org/pypi/tldextract

From their readme:

>>> import tldextract
>>> tldextract.extract('http://forums.news.cnn.com/')
ExtractResult(subdomain='forums.news', domain='cnn', suffix='com')

ExtractResult is a namedtuple, which makes it pretty easy: the name is result.domain and the extension is result.suffix.

The advantage of using a library like this is that it keeps up with additions to the public suffix list, so you don't have to.

mlissner answered Sep 26 '22 07:09


In general, it's not easy to work out where the user-registered bit ends and the registry bit begins. For example: a.com, b.co.uk, c.us, d.ca.us, e.uk.com, f.pvt.k12.wy.us...

The nice people at Mozilla have a project dedicated to listing domain suffixes under which the public can register domains: http://publicsuffix.org/
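
To make the problem concrete, here is a minimal sketch of longest-suffix matching in plain Python. The names SAMPLE_SUFFIXES and split_domain are made up for this illustration, and the suffix set is a small hand-picked sample, not the real list from publicsuffix.org. The real list also has wildcard and exception rules that this sketch ignores, so treat it as an explanation of the idea rather than something to use on real data.

```python
# Hypothetical, hand-picked sample of suffixes for illustration only.
# The real public suffix list is far longer and has wildcard/exception rules.
SAMPLE_SUFFIXES = {"com", "uk", "co.uk", "us", "ca.us", "uk.com"}


def split_domain(hostname, suffixes=SAMPLE_SUFFIXES):
    """Return (name, extension) using the longest suffix found in `suffixes`.

    `name` is the single label directly to the left of the matched suffix.
    If no suffix matches, the whole hostname is returned with an empty extension.
    """
    host = hostname.lower().rstrip(".")
    labels = host.split(".")
    # Start at index 1 so at least one label is left over for the name,
    # and walk from the longest candidate suffix to the shortest.
    for i in range(1, len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            return labels[i - 1], candidate
    return host, ""


for host in ["a.com", "b.co.uk", "c.us", "d.ca.us", "e.uk.com"]:
    print(host, "->", split_domain(host))

# A naive split on the last dot gets multi-label suffixes wrong:
print("b.co.uk".rsplit(".", 1))  # ['b.co', 'uk']
```

With the sample set above, b.co.uk splits into ('b', 'co.uk') and e.uk.com into ('e', 'uk.com'), while the naive rsplit reports 'uk' as the extension of b.co.uk. Which answer you get depends entirely on what is in the suffix set, which is why a maintained copy of the list matters.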

Andrew Aylett answered Sep 24 '22 07:09