A fundamental part of the Java Object contract is that the hashCode() method should be consistent with the equals() method. This makes sense and is easy to understand: if two objects are "equal" in some way, they should return the same hash code. If not, you could put one object in a HashSet, for instance, and then later check to see if a separate instance is in the set and incorrectly get back false, even though the equals() method would have considered the objects equivalent.
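To make the HashSet failure concrete, here is a minimal sketch. The class names CaseInsensitiveName and BrokenHashDemo are made up for illustration and have nothing to do with the JDK; the key type deliberately pairs a case-insensitive equals() with a case-sensitive hashCode().

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical key type: equals() ignores case, hashCode() does not. */
final class CaseInsensitiveName {
    private final String value;

    CaseInsensitiveName(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof CaseInsensitiveName
                && value.equalsIgnoreCase(((CaseInsensitiveName) other).value);
    }

    // Deliberately inconsistent with equals(): hashes the raw string,
    // so "ABC" and "abc" produce different hash codes.
    @Override
    public int hashCode() {
        return value.hashCode();
    }
}

final class BrokenHashDemo {
    /** Adds "ABC" to a HashSet, then looks up the equal instance "abc". */
    static boolean lookupSucceeds() {
        Set<CaseInsensitiveName> names = new HashSet<CaseInsensitiveName>();
        names.add(new CaseInsensitiveName("ABC"));
        return names.contains(new CaseInsensitiveName("abc"));
    }

    public static void main(String[] args) {
        CaseInsensitiveName upper = new CaseInsensitiveName("ABC");
        CaseInsensitiveName lower = new CaseInsensitiveName("abc");
        System.out.println("equals: " + upper.equals(lower));
        System.out.println("found in HashSet: " + lookupSucceeds());
    }
}
```

The set compares hash codes before it ever calls equals(), so the lookup misses even though the two instances are equal.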
In fact Java's URI code has this problem as of Java 6. Try this code:
    import static org.hamcrest.CoreMatchers.*;
    import static org.junit.Assert.*;

    import java.net.URI;

    import org.junit.Test;

    public class URITest
    {
        @Test
        public void testURIHashCode()
        {
            // The two URIs differ only in the case of the hex digit in the escape sequence.
            final URI uri1 = URI.create("http://www.example.com/foo%2Abar");
            final URI uri2 = URI.create("http://www.example.com/foo%2abar");
            // The message is shown only when the assertion fails.
            assertThat("URIs are not equal.", uri1, equalTo(uri2));
            assertThat("Equal URIs do not have same hash code.", uri1.hashCode(), equalTo(uri2.hashCode()));
        }
    }
URI escape sequences, as per RFC 3968, are case insensitive; that is, %2A and %2a are considered equivalent. The Java URI.equals() implementation takes this into account. However, the URI.hashCode() implementation does not! This means that two URI instances that return true for URI.equals() can nevertheless return different hash codes, as illustrated in the code above!
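For illustration, here is one way a hash could be made insensitive to the case of escape sequences. This is only a sketch, not how the JDK computes URI.hashCode(): the class EscapeInsensitiveHash and its methods are hypothetical names, and the sketch covers only the escape-case difference discussed here. Other things that URI.equals() ignores, such as scheme and host case, are not handled.

```java
import java.net.URI;

/** Hypothetical helper, not part of the JDK. */
final class EscapeInsensitiveHash {

    /** Upper-cases the two hex digits of every well-formed %XX escape. */
    static String normalizeEscapes(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            boolean escape = c == '%'
                    && i + 2 < s.length()
                    && Character.digit(s.charAt(i + 1), 16) >= 0
                    && Character.digit(s.charAt(i + 2), 16) >= 0;
            if (escape) {
                out.append('%')
                   .append(Character.toUpperCase(s.charAt(i + 1)))
                   .append(Character.toUpperCase(s.charAt(i + 2)));
                i += 3;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    /** Hashes the URI's string form after normalizing escape case. */
    static int hash(URI uri) {
        return normalizeEscapes(uri.toString()).hashCode();
    }

    public static void main(String[] args) {
        URI uri1 = URI.create("http://www.example.com/foo%2Abar");
        URI uri2 = URI.create("http://www.example.com/foo%2abar");
        System.out.println(hash(uri1) == hash(uri2));
    }
}
```

A wrapper key class could delegate equals() to URI.equals() and hashCode() to a helper like this, which sidesteps the inconsistency for URIs that differ only in escape case.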
I submitted this issue, which supposedly resulted in Java Bug 7134993, but that bug is no longer available. The same issue, though, is shown in Java Bug 7054089. (I'm not sure if this was from my submission or from someone else, but the issue is the same.) However, the bug was denied with the evaluation, "The examples cited are opaque URIs and so the scheme specific parts are not parsed."
Whoever evaluated this bug must not have been familiar with what it means for equals() and hashCode() to be consistent. The contract for Object.hashCode() clearly states, "If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result." Note the use of "must", not "should".
The point here is that, even though the evaluator claims that the URI is "opaque" and "not parsed", the URI.equals() implementation (contrary to his/her claims) is indeed parsing the URI and making allowances for case insensitivity. The URI.hashCode() implementation is not.
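As a rough check of the "opaque" claim, the sketch below asks URI.isOpaque() about the http URI from the test and about a mailto URI. The mailto URIs are invented for illustration and are not the ones from the bug report; the class name OpaqueUriDemo is likewise hypothetical.

```java
import java.net.URI;

final class OpaqueUriDemo {
    // Hierarchical URIs from the test above.
    static final URI HTTP_UPPER = URI.create("http://www.example.com/foo%2Abar");
    static final URI HTTP_LOWER = URI.create("http://www.example.com/foo%2abar");

    // Hypothetical opaque URIs: the scheme-specific part does not start with '/'.
    static final URI MAILTO_UPPER = URI.create("mailto:foo%2Abar@example.com");
    static final URI MAILTO_LOWER = URI.create("mailto:foo%2abar@example.com");

    public static void main(String[] args) {
        System.out.println("http opaque:   " + HTTP_UPPER.isOpaque());
        System.out.println("mailto opaque: " + MAILTO_UPPER.isOpaque());
        // equals() compares escaped octets without regard to case in both forms.
        System.out.println("http equal:    " + HTTP_UPPER.equals(HTTP_LOWER));
        System.out.println("mailto equal:  " + MAILTO_UPPER.equals(MAILTO_LOWER));
    }
}
```

Whether or not a URI is opaque, equals() treats the two escape spellings as the same, so the hashCode() contract applies to both kinds.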
So am I being completely dense here and missing something obvious? If I am, someone please enlighten me as to my mistake, and I'll mark your answer as correct. Otherwise, the question is: now that Sun/Oracle no longer seems to allow comments on filed bugs, what recourse do I have to get recognition and action on this fundamental problem in the Java implementation of the primary identifier of the Internet, the URI?
I would resubmit a bug against Java 1.7.0_17 using the example you give here rather than the one in bug 7054089. I checked that your StackOverflow example also holds on that version. I heard that Oracle has closed bug fixing on Java 6 except for security issues.
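If it helps with a resubmission, here is a small probe without JUnit or Hamcrest that can be run on whichever JDK is installed. The class name UriHashProbe is just an illustration. The output depends on the Java version, so none is shown here.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

final class UriHashProbe {
    static final URI UPPER = URI.create("http://www.example.com/foo%2Abar");
    static final URI LOWER = URI.create("http://www.example.com/foo%2abar");

    static boolean equalsHolds() {
        return UPPER.equals(LOWER);
    }

    static boolean hashCodesMatch() {
        return UPPER.hashCode() == LOWER.hashCode();
    }

    /** Adds one URI to a HashSet and looks up the other, equal one. */
    static boolean setLookupWorks() {
        Set<URI> uris = new HashSet<URI>();
        uris.add(UPPER);
        return uris.contains(LOWER);
    }

    public static void main(String[] args) {
        System.out.println("java.version:     " + System.getProperty("java.version"));
        System.out.println("equals():         " + equalsHolds());
        System.out.println("hash codes match: " + hashCodesMatch());
        System.out.println("HashSet lookup:   " + setLookupWorks());
    }
}
```

If equals() prints true while the hash codes differ, that output together with the java.version line makes a compact reproduction to attach to the report.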
In your original bug submission, the URIs you gave are opaque URIs and that may have thrown the evaluator off. And I think you mean RFC 2396.
Also, you might get a fresh evaluator :)
Seems to me that they have definitely broken the contract for hashCode() here.
And it's a shame that they don't have a StackOverflow-like commentary mechanism for Java (or the basic comments they had in the past).