Given this:
URL u = new URL("someURL");
How do I identify the top-level domain of the URL?
Guava provides a nice utility for this. It works as follows:
InternetDomainName.from("someurl.co.uk").publicSuffix()
will get you co.uk. Note that this is the public suffix, which can span more than one label.
InternetDomainName.from("someurl.de").publicSuffix()
will get you de
So you want to have the top-level domain part only?
import java.net.MalformedURLException;
import java.net.URL;

//parameter urlString: a String
//returns: a String representing the TLD of urlString, or null iff urlString is malformed
private String getTldString(String urlString) {
    String tldString = null;
    try {
        URL url = new URL(urlString);
        // the TLD is the last dot-separated label of the host
        String[] domainNameParts = url.getHost().split("\\.");
        tldString = domainNameParts[domainNameParts.length - 1];
    } catch (MalformedURLException e) {
        // malformed (or null) input: fall through and return null
    }
    return tldString;
}
Let's test it!
@Test
public void identifyLocale() {
    String ukString = "http://www.amazon.co.uk/Harry-Potter-Sheet-Complete-Series/dp/0739086731";
    logger.debug("ukString TLD: {}", getTldString(ukString));
    String deString = "http://www.amazon.de/The-Essential-George-Gershwin/dp/B00008GEOT";
    logger.debug("deString TLD: {}", getTldString(deString));
    String ceShiString = "http://例子.测试";
    logger.debug("ceShiString TLD: {}", getTldString(ceShiString));
    String dokimeString = "http://παράδειγμα.δοκιμή";
    logger.debug("dokimeString TLD: {}", getTldString(dokimeString));
    String nullString = null;
    logger.debug("nullString TLD: {}", getTldString(nullString));
    String lolString = "lol, this is a malformed URL, amirite?!";
    logger.debug("lolString TLD: {}", getTldString(lolString));
}
Output:
ukString TLD: uk
deString TLD: de
ceShiString TLD: 测试
dokimeString TLD: δοκιμή
nullString TLD: null
lolString TLD: null
According to the docs, the host part of the URL conforms to RFC 2732, which means it can be an IPv6 literal enclosed in square brackets (for example [::1]) rather than a domain name. So simply splitting the string you get from
String host = u.getHost();
is not enough in general. Either handle the host forms that RFC 2732 allows when examining the host, or, if you can guarantee that all addresses are of the form server.com, search for the last . in the string and take what follows as the TLD.
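Here is a rough sketch of that check, using only java.net.URL. The class name TldSketch and the method lastLabelOrNull are made up for illustration, and returning null for IP literals and dot-less hosts such as localhost is an assumption about what you want, not something the docs prescribe.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper, not part of any library.
final class TldSketch {

    // Returns the text after the last '.' of the host, or null when the URL is
    // malformed, the host is an IP literal, or the host has no usable last label.
    static String lastLabelOrNull(String urlString) {
        final String host;
        try {
            host = new URL(urlString).getHost();
        } catch (MalformedURLException e) {
            return null;
        }
        // URL.getHost() keeps the square brackets around an IPv6 literal.
        if (host.isEmpty() || host.startsWith("[")) {
            return null;
        }
        // Dotted-quad IPv4 addresses have dots but no TLD.
        if (host.matches("\\d{1,3}(\\.\\d{1,3}){3}")) {
            return null;
        }
        int lastDot = host.lastIndexOf('.');
        if (lastDot < 0 || lastDot == host.length() - 1) {
            return null; // no dot at all, or a trailing dot
        }
        return host.substring(lastDot + 1);
    }
}
```

With this, lastLabelOrNull("http://www.amazon.co.uk/x") gives uk, while hosts like [::1] or 192.168.0.1 give null instead of a bogus "TLD".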