How would you split a domain name into its name and extension?
Wow, there are a lot of bad answers here. You can only do this if you know what's on the public suffix list. If you are using split or a regex or something else, you're doing this wrong.
Luckily, this is Python, and there's a library for this: https://pypi.python.org/pypi/tldextract
From their readme:
>>> import tldextract
>>> tldextract.extract('http://forums.news.cnn.com/')
ExtractResult(subdomain='forums.news', domain='cnn', suffix='com')
ExtractResult is a namedtuple, which makes it pretty easy to pull out the parts you need.
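To show what that buys you, here is a small sketch using a stand-in namedtuple with the same field names as the output above. It is not tldextract's own class, just an illustration that runs without the library installed. The result type may differ between releases of the library, so check the version you install.

```python
from collections import namedtuple

# Stand-in for illustration only: same field names as the readme output,
# but this is NOT tldextract's own class.
ExtractResult = namedtuple("ExtractResult", ["subdomain", "domain", "suffix"])

result = ExtractResult(subdomain="forums.news", domain="cnn", suffix="com")

# Fields can be read by name...
name_and_extension = f"{result.domain}.{result.suffix}"

# ...or unpacked positionally, like any ordinary tuple.
subdomain, domain, suffix = result

print(name_and_extension)
```

Reading fields by name is the more readable of the two, and it does not depend on remembering the field order.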
The advantage of using a library like this is that its maintainers keep up with additions to the public suffix list so you don't have to.
In general, it's not easy to work out where the user-registered bit ends and the registry bit begins. For example: a.com, b.co.uk, c.us, d.ca.us, e.uk.com, f.pvt.k12.wy.us...
The nice people at Mozilla have a project dedicated to listing domain suffixes under which the public can register domains: http://publicsuffix.org/
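To make the problem concrete, here is a rough sketch comparing a naive "last dot" split with a longest-suffix lookup. The suffix set is a tiny hand-picked sample chosen for this illustration, and both function names are hypothetical. It is no substitute for the real list, which is far larger and changes over time.

```python
# Hand-picked sample for illustration only; the real public suffix list
# is far larger and is updated regularly.
SAMPLE_SUFFIXES = {"com", "co.uk", "uk", "us", "ca.us", "uk.com"}


def naive_split(hostname):
    """Treat everything after the last dot as the extension."""
    name, _, extension = hostname.rpartition(".")
    return name, extension


def split_with_suffixes(hostname, suffixes=SAMPLE_SUFFIXES):
    """Split at the longest trailing run of labels found in `suffixes`."""
    labels = hostname.lower().split(".")
    # Try the longest candidate first, dropping one leading label each time.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            return ".".join(labels[:i]), candidate
    # No known suffix matched: report the whole hostname with no extension.
    return hostname, ""


for host in ["a.com", "b.co.uk", "c.us", "d.ca.us", "e.uk.com"]:
    print(host, naive_split(host), split_with_suffixes(host))
```

The naive version reports `b.co` as the name for b.co.uk, while the lookup version returns `b` and `co.uk`. The lookup is only as good as the suffix data behind it, which is why relying on a maintained library is the safer choice.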