Why does the first call return null, while the second one returns mail.yahoo.com?
Isn't this weird? If not, what's the logic behind this behavior?
Is the underscore the culprit? Why?
    public static void main(String[] args) throws Exception {
        java.net.URI uri = new java.net.URI("http://broken_arrow.huntingtonhelps.com");
        String host = uri.getHost();
        System.out.println("Host = [" + host + "].");
        uri = new java.net.URI("http://mail.yahoo.com");
        host = uri.getHost();
        System.out.println("Host = [" + host + "].");
    }
As @hsz mentioned in the comments, this is a known bug.
But let's debug and look inside the source of the URI class. The problem is in the method private int parseHostname(int start, int n): parsing the first URI fails at these lines:
    if ((p < n) && !at(p, n, ':'))
        fail("Illegal character in hostname", p);
This happens because the _ character is not accounted for in the scan block, which allows only letters, digits, and the - character (L_ALPHANUM, H_ALPHANUM, L_DASH and H_DASH).
And yes, this is not fixed yet in Java 7.
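To see this without stepping through the JDK source, here is a minimal sketch. It relies on the documented behavior that java.net.URI falls back to a registry-based authority when server-based parsing fails, so getAuthority() still returns the raw text while getHost() is null. The class name UnderscoreHostDemo and the helper hostOrAuthority are made up for illustration, and the fallback is only an assumption about what a caller might want: the authority can also contain user info and a port, which this sketch does not strip.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UnderscoreHostDemo {

    // Hypothetical helper: prefer the parsed host, otherwise fall back to the
    // raw authority (which may also carry user info and a port).
    static String hostOrAuthority(URI uri) {
        String host = uri.getHost();
        return host != null ? host : uri.getAuthority();
    }

    public static void main(String[] args) throws Exception {
        URI uri = new URI("http://broken_arrow.huntingtonhelps.com");

        // The host is undefined because the authority was not parsed as server-based.
        System.out.println("getHost()      = " + uri.getHost());      // null
        // The raw authority is still available.
        System.out.println("getAuthority() = " + uri.getAuthority()); // broken_arrow.huntingtonhelps.com

        // Forcing server-based parsing surfaces the error that the constructor swallowed.
        try {
            uri.parseServerAuthority();
        } catch (URISyntaxException e) {
            System.out.println("parseServerAuthority() failed: " + e.getMessage());
        }

        // java.net.URL does not validate the hostname, so it keeps the underscore.
        System.out.println("URL host       = " + uri.toURL().getHost());

        System.out.println("fallback       = " + hostOrAuthority(uri));
    }
}
```

If you only need the host string and do not care about strict validation, going through getAuthority() or URL is the usual way around the null.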
It's because of the underscore in the base URI. Remove the underscore and it works, as shown below:
    public static void main(String[] args) throws Exception {
        java.net.URI uri = new java.net.URI("http://brokenarrow.huntingtonhelps.com");
        String host = uri.getHost();
        System.out.println("Host = [" + host + "].");
        uri = new java.net.URI("http://mail.yahoo.com");
        host = uri.getHost();
        System.out.println("Host = [" + host + "].");
    }
I don't think it's a bug in Java. I think Java is parsing hostnames correctly according to the spec. There are good explanations of the spec here: http://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names and here: http://www.netregister.biz/faqit.htm#1
Specifically, hostnames MUST NOT contain underscores.
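As a rough illustration of that rule, here is a simplified check: each dot-separated label may contain only letters, digits, and hyphens, and may not start or end with a hyphen. This is a sketch of the restriction described above, not the code java.net.URI actually uses. The class name HostnameCheck and the method isValidHostname are invented for the example, and details such as a trailing dot or internationalized names are deliberately ignored.

```java
import java.util.regex.Pattern;

public class HostnameCheck {

    // One label: 1 to 63 characters, letters/digits/hyphens only,
    // with no leading or trailing hyphen.
    private static final Pattern LABEL =
            Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?");

    // Simplified validity check; assumes a plain ASCII name without a trailing dot.
    static boolean isValidHostname(String name) {
        if (name == null || name.isEmpty() || name.length() > 253) {
            return false;
        }
        // The -1 limit keeps empty labels, so "a..b" and "a." are rejected.
        for (String label : name.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidHostname("mail.yahoo.com"));                   // true
        System.out.println(isValidHostname("broken_arrow.huntingtonhelps.com")); // false
    }
}
```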