Why is the 1st one returning null, while the 2nd one is returning mail.yahoo.com?
Isn't this weird? If not, what's the logic behind this behavior?
Is the underscore the culprit? Why?
public static void main(String[] args) throws Exception {
    java.net.URI uri = new java.net.URI("http://broken_arrow.huntingtonhelps.com");
    String host = uri.getHost();
    System.out.println("Host = [" + host + "].");
    uri = new java.net.URI("http://mail.yahoo.com");
    host = uri.getHost();
    System.out.println("Host = [" + host + "].");
}
As @hsz mentioned in the comments, this is a known bug.
But let's debug and look inside the source of the URI class. The problem is in the method:
private int parseHostname(int start, int n)
Parsing the first URI fails at these lines:
if ((p < n) && !at(p, n, ':'))
    fail("Illegal character in hostname", p);
This is because the _ character is not accounted for in the scan block, which allows only letters, digits and the - character (L_ALPHANUM, H_ALPHANUM, L_DASH and H_DASH).
The constructor itself does not throw: when server-based parsing fails, URI treats the authority as registry-based instead, which is why getHost() returns null.
And yes, this is not fixed yet in Java 7.
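If you still need the host from such a URI, one possible workaround is to fall back to getAuthority(), which does return the raw authority string even when getHost() is null. This is only a rough sketch: the class and method names (HostFallback, hostOrAuthority) are made up for illustration, and the user-info/port stripping is deliberately simplistic.

```java
import java.net.URI;

public class HostFallback {

    // Hypothetical helper: returns getHost() when URI could parse the host,
    // otherwise derives a host-like string from getAuthority().
    static String hostOrAuthority(String s) {
        URI uri = URI.create(s);
        String host = uri.getHost();
        if (host != null) {
            return host;
        }
        String authority = uri.getAuthority();
        if (authority == null) {
            return null; // no authority component at all
        }
        // Drop an optional "user:password@" prefix.
        int at = authority.lastIndexOf('@');
        if (at >= 0) {
            authority = authority.substring(at + 1);
        }
        // Drop an optional ":port" suffix, only if what follows the colon is numeric.
        int colon = authority.lastIndexOf(':');
        if (colon >= 0 && authority.substring(colon + 1).matches("\\d*")) {
            authority = authority.substring(0, colon);
        }
        return authority;
    }

    public static void main(String[] args) {
        System.out.println("Host = [" + hostOrAuthority("http://broken_arrow.huntingtonhelps.com") + "].");
        System.out.println("Host = [" + hostOrAuthority("http://mail.yahoo.com") + "].");
    }
}
```

Keep in mind that a value obtained this way has not been validated as a hostname by URI, so treat it as untrusted input.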
It's because of the underscore in the URI's host. Remove the underscore and it works, as shown below:
public static void main(String[] args) throws Exception {
    java.net.URI uri = new java.net.URI("http://brokenarrow.huntingtonhelps.com");
    String host = uri.getHost();
    System.out.println("Host = [" + host + "].");
    uri = new java.net.URI("http://mail.yahoo.com");
    host = uri.getHost();
    System.out.println("Host = [" + host + "].");
}
I don't think it's a bug in Java. I think Java is parsing hostnames correctly according to the spec. There are good explanations of the spec here: http://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names and here: http://www.netregister.biz/faqit.htm#1
Specifically, hostnames MUST NOT contain underscores.
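To make the rule concrete, here is a rough sketch of a hostname check along the lines described in those links: each dot-separated label may contain only letters, digits and hyphens, must not start or end with a hyphen, and is at most 63 characters long. The class and method names (HostnameCheck, isValidHostname) are mine, and this is a simplified check, not the exact grammar that java.net.URI implements.

```java
import java.util.regex.Pattern;

public class HostnameCheck {

    // One label: starts and ends with a letter or digit, hyphens only in between, 1 to 63 chars.
    private static final Pattern LABEL =
            Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?");

    // Hypothetical helper: true if every dot-separated label is well formed.
    static boolean isValidHostname(String name) {
        if (name == null || name.isEmpty() || name.length() > 253) {
            return false;
        }
        // The -1 limit keeps empty labels, so "a..b" and a trailing dot are rejected.
        for (String label : name.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidHostname("broken_arrow.huntingtonhelps.com")); // false
        System.out.println(isValidHostname("mail.yahoo.com"));                   // true
    }
}
```

The underscore fails the label pattern, which matches the behavior seen in the question.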