
How to get domain name from URL

Tags:

regex

url

How can I fetch a domain name from a URL String?

Examples:

+----------------------+------------+
| input                | output     |
+----------------------+------------+
| www.google.com       | google     |
| www.mail.yahoo.com   | mail.yahoo |
| www.mail.yahoo.co.in | mail.yahoo |
| www.abc.au.uk        | abc        |
+----------------------+------------+

Related:

  • Matching a web address through regex
Chinmay asked Feb 20 '09 11:02



1 Answer

I once had to write such a regex for a company I worked for. The solution was this:

  • Get a list of every ccTLD and gTLD available. Your first stop should be IANA. The list from Mozilla looks great at first sight, but it lacks ac.uk, for example, so it is not really usable for this.
  • Join the list into one alternation, like the example below. A warning: ordering is important! If org.uk appeared after uk, then example.org.uk could match org instead of example.

Example regex:

^(?:[^.]+\.)*?([^.]+)\.(com|net|org|info|coop|int|co\.uk|org\.uk|ac\.uk|uk|__and so on__)$

This worked really well and also matched weird, unofficial top-levels like de.com and friends.
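The two steps above (collect the suffixes, then join them into one alternation) can be sketched in Python. This is only an illustration, not the regex the answer's author actually used: the `SUFFIXES` list is a tiny hypothetical sample, and the helper names `build_domain_regex` and `domain_label` are made up for this example. The pattern is anchored at both ends and skips leading labels such as `www.` lazily, which is an assumption of this sketch.

```python
import re

# Hypothetical, deliberately tiny suffix list. A real one would be built
# from the IANA data plus any second-level suffixes you need to support.
SUFFIXES = ["com", "net", "org", "info", "coop", "int",
            "uk", "co.uk", "org.uk", "ac.uk", "de.com"]


def build_domain_regex(suffixes):
    """Compile a regex that captures the label directly before a known suffix."""
    # Multi-label suffixes first, so "org.uk" is listed before "uk".
    ordered = sorted(suffixes, key=lambda s: (-s.count("."), -len(s), s))
    alternation = "|".join(re.escape(s) for s in ordered)
    # (?:[^.]+\.)*? lazily skips leading labels such as "www."
    return re.compile(
        r"^(?:[^.]+\.)*?([^.]+)\.(" + alternation + r")$",
        re.IGNORECASE,
    )


def domain_label(host, pattern):
    """Return the label before the suffix, or None if no suffix is known."""
    match = pattern.match(host)
    return match.group(1) if match else None


DOMAIN_RE = build_domain_regex(SUFFIXES)

for host in ("www.google.com", "example.org.uk", "foo.de.com", "example.xyz"):
    print(host, "->", domain_label(host, DOMAIN_RE))
```

Note that this captures only the single label in front of the suffix, so `www.mail.yahoo.com` gives `yahoo` rather than the `mail.yahoo` shown in the question's table. Generating the alternation from a list also means the pattern never has to be edited by hand when the list changes.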

The upside:

  • Very fast if regex is optimally ordered

The downside of this solution is of course:

  • Handwritten regex which has to be updated manually if ccTLDs change or get added. Tedious job!
  • Very large regex so not very readable.
pi. answered Sep 27 '22 23:09