I used the following to extract the domain from a URL (the strings added to the list are my test cases):
import java.util.ArrayList;

String regex = "^(ww[a-zA-Z0-9-]{0,}\\.)";
ArrayList<String> cases = new ArrayList<String>();
cases.add("www.google.com");
cases.add("ww.socialrating.it");
cases.add("www-01.hopperspot.com");
cases.add("wwwsupernatural-brasil.blogspot.com");
cases.add("xtop10.net");
cases.add("zoyanailpolish.blogspot.com");
for (String t : cases) {
    // Strip a leading label that starts with "ww", e.g. "www." or "www-01."
    String res = t.replaceAll(regex, "");
    System.out.println(res);
}
I can get the following results:
google.com
socialrating.it
hopperspot.com
blogspot.com
xtop10.net
zoyanailpolish.blogspot.com
The first four cases are good, but the last one is not. What I want for the last one is blogspot.com, but it gives zoyanailpolish.blogspot.com. What am I doing wrong?
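For what it's worth, the pattern above only removes a leading label that starts with "ww", so a host such as zoyanailpolish.blogspot.com never matches and comes back unchanged. Here is a minimal sketch of a different approach that keeps only the last two dot-separated labels. It assumes every suffix is a single label (.com, .it, .net) and will give wrong answers for suffixes like co.uk; the class name DomainSketch and the method lastTwoLabels are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class DomainSketch {
    // Keeps only the last two dot-separated labels of a host name.
    // Assumption: the suffix is a single label such as "com" or "it".
    static String lastTwoLabels(String host) {
        int last = host.lastIndexOf('.');
        if (last <= 0) {
            return host; // no usable dot: nothing to trim
        }
        int prev = host.lastIndexOf('.', last - 1);
        // If there is no earlier dot, the host already has two labels.
        return prev < 0 ? host : host.substring(prev + 1);
    }

    public static void main(String[] args) {
        List<String> cases = Arrays.asList(
                "www.google.com",
                "ww.socialrating.it",
                "www-01.hopperspot.com",
                "wwwsupernatural-brasil.blogspot.com",
                "xtop10.net",
                "zoyanailpolish.blogspot.com");
        for (String t : cases) {
            System.out.println(lastTwoLabels(t));
        }
    }
}
```

With the six test cases from the question, this yields blogspot.com for both blogspot hosts and leaves xtop10.net as it is. For multi-label suffixes a suffix list is needed rather than label counting.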
The =REGEXREPLACE() function is built into Google Sheets, and it can be used to extract domains from URLs. What's great about it is that it's only a single line you can paste into a cell. The function is not super technical, and you can change the pattern any way you see fit.
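The answer does not show the actual formula, so as a rough sketch here is the same regex-replace idea in Java. It assumes the input is either a bare host or a URL with an optional scheme, port, path, query, or fragment; user-info parts such as "user@host" are not handled. The class name HostSketch, the method hostOf, and both patterns are my own illustration, not the spreadsheet formula.

```java
class HostSketch {
    // Reduces a URL to its host part using two regex replacements.
    static String hostOf(String url) {
        return url
                // Drop a leading scheme such as "http://" or "https://".
                .replaceFirst("^[a-zA-Z][a-zA-Z0-9+.-]*://", "")
                // Drop everything from the first port, path, query or fragment marker.
                .replaceFirst("[/?#:].*$", "");
    }

    public static void main(String[] args) {
        System.out.println(hostOf("https://www.google.com/search?q=1"));
        System.out.println(hostOf("xtop10.net"));
    }
}
```

The result is still the full host (for example www.google.com), so it would have to be reduced to the registered domain in a second step.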
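You can use the "whois" command to look up the registration record for a given domain name, for example "whois example.com". Note that the output is a registration record (registrar, dates, name servers), not the bare suffix ".com", so the suffix still has to be read off the name you queried.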
Using the Guava library, we can easily get the domain name:
InternetDomainName.from(tld).topPrivateDomain()
Refer to the API documentation for more details:
https://google.github.io/guava/releases/14.0/api/docs/
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/net/InternetDomainName.html