What is the proper way to get the domain from a URL without the subdomains?
In Java, you can construct a new URL(urlString) from a string and call getHost() on it, but the result still includes any subdomains.
The difficulty is that hosts can look like either subhost.example.com or subhost.example.co.uk.
There are many other two-part suffixes like co.uk (see the list at https://wiki.mozilla.org/TLD_List).
It seems to me the only correct way to get just the domain is to look up the end of the host in the TLD list, set that suffix aside, and then drop everything before the last period in what remains, keeping one label plus the suffix. Is there an existing method that does this? I didn't see one in java.net.URL, and I looked through Apache Commons a bit but couldn't find one there.
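The lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name RegistrableDomain, its methods, and the four-entry SUFFIXES set are hypothetical stand-ins, and the real public suffix list is far larger and also contains wildcard and exception rules that this sketch ignores.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

class RegistrableDomain {
    // Hypothetical, deliberately tiny stand-in for the real suffix list.
    private static final Set<String> SUFFIXES = Set.of("com", "org", "uk", "co.uk");

    /** Returns the longest matching suffix plus one label, or null if there is none. */
    static String fromHost(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        if (SUFFIXES.contains(h)) {
            return null; // the host is itself a suffix, so no domain sits under it
        }
        // Walk the dots left to right, so the first suffix hit is the longest one.
        int dot = h.indexOf('.');
        while (dot >= 0) {
            if (SUFFIXES.contains(h.substring(dot + 1))) {
                // Keep exactly one label in front of the matched suffix.
                int previousDot = h.lastIndexOf('.', dot - 1);
                return h.substring(previousDot + 1);
            }
            dot = h.indexOf('.', dot + 1);
        }
        return null; // no known suffix, e.g. "localhost"
    }

    /** Extracts the host from a URL string, then applies fromHost. */
    static String fromUrl(String url) {
        String host = URI.create(url).getHost();
        return host == null ? null : fromHost(host);
    }

    public static void main(String[] args) {
        System.out.println(fromUrl("http://subhost.example.com/index.html"));
        System.out.println(fromUrl("http://subhost.example.co.uk/index.html"));
    }
}
```

Matching the longest suffix first is what keeps co.uk from being mistaken for a registrable domain under uk.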
To recap, a subdomain is the portion of a hostname that comes before the “main” domain name and the domain extension. For example, in docs.themeisle.com the subdomain is docs. Subdomains can help you divide your website into logical parts or create separate sites, for example a separate blog for each sports team.
A new domain is a distinctly different website, while a subdomain is a separate hostname under the main domain that can operate independently of it.
A subdomain is, as the name suggests, an additional section of your main domain name. You create subdomains to help organize and navigate to different sections of your main website. Within your main domain, you can have as many subdomains as you need to cover the different sections of your website.
You should only use subdomains if you have a good reason to do so. For example, you can use subdomains to rank for different keywords, target a specific market, reach a different location, or serve a language other than that of your main website. Subdirectories, by contrast, are paths under your primary domain.
I know this is a few years late, but if anyone stumbles across this question, try the following with Guava's InternetDomainName:
InternetDomainName.from("subhost.example.co.uk").topPrivateDomain().toString()
The above will return example.co.uk.
I'm not sure the above answer is correct in every case:
InternetDomainName.from("test.blogspot.com").topPrivateDomain() -> test.blogspot.com
This works better in my case:
InternetDomainName.from("test.blogspot.com").topDomainUnderRegistrySuffix() -> blogspot.com
Details: https://github.com/google/guava/wiki/InternetDomainNameExplained
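The solutions above require you to add Guava. If you already use OkHttp or Retrofit, you can instead use OkHttp's PublicSuffixDatabase (it lives in the okhttp3.internal.publicsuffix package, so it is not a stable public API):
PublicSuffixDatabase.get().getEffectiveTldPlusOne("test.blogspot.com")
This gives you blogspot.com.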