
Java overriding equals() and hashcode() for two interchangeable integers

I'm overriding the equals and hashcode methods for a simple container object for two ints. Each int reflects the index of another object (it doesn't matter what that object is). The point of the class is to represent a connection between the two objects.

The direction of the connection doesn't matter, so the equals method should return true regardless of which way round the two ints are stored in the object. E.g.

connectionA = new Connection(1,2);
connectionB = new Connection(1,3);
connectionC = new Connection(2,1);

connectionA.equals(connectionB); // returns false
connectionA.equals(connectionC); // returns true

Here is what I have (modified from the source code for Integer):

public class Connection {
    // Simple container for two numbers which are connected.
    // Two Connection objects are equal regardless of the order of from and to.

    int from;
    int to;

    public Connection(int from, int to) {
        this.from = from;
        this.to = to;
    }

    // Modified from Integer source code
    @Override
    public boolean equals(Object obj) {
        if (obj instanceof Connection) {
            Connection connectionObj = (Connection) obj;
            return ((from == connectionObj.from && to == connectionObj.to) || (from == connectionObj.to && to == connectionObj.from));
        }
        return false;
    }

    @Override
    public int hashCode() {
        return from*to;
    }
}

This does work; however, my question is: is there a better way to achieve this?

My main worry is that the hashCode() method will return the same hashcode for any two pairs of integers whose products are equal. E.g.

3*4 = 12
2*6 = 12 // same!
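
To get a feel for how often this happens, here is a rough sketch that counts distinct hashcodes over all unordered pairs of small indices. The class and method names (ProductHashDemo, productHash, distinctHashes) are made up for illustration and are not part of the Connection class above.

```java
import java.util.HashSet;
import java.util.Set;

class ProductHashDemo {
    // The hash from the question: symmetric, but many pairs share a product.
    static int productHash(int from, int to) {
        return from * to;
    }

    // Counts distinct hash values over all unordered pairs with 0 <= a <= b < n.
    static int distinctHashes(int n) {
        Set<Integer> seen = new HashSet<>();
        for (int a = 0; a < n; a++) {
            for (int b = a; b < n; b++) {
                seen.add(productHash(a, b));
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int n = 100; // assumed index range, chosen only for the demo
        int pairs = n * (n + 1) / 2;
        System.out.println(pairs + " unordered pairs, "
                + distinctHashes(n) + " distinct hashcodes");
        // Any connection involving index 0 hashes to 0.
        System.out.println(productHash(0, 7) == productHash(0, 42)); // true
    }
}
```

Note that every connection involving index 0 lands in the same bucket, on top of the collisions between equal products.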

The documentation, http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/Object.html#hashCode(), states that

It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

If anyone can see a simple way of reducing the number of matching hashcodes then I would be appreciative of an answer.

Thanks!

Tim

PS I'm aware that there is a java.sql.Connection which could cause some import annoyances. The object actually has a more specific name in my application but for brevity I shortened it to Connection here.

asked Apr 08 '13 11:04 by Twice Circled



1 Answer

Three solutions that would "work" have been proposed. (By "work", I mean that they satisfy the basic requirement of a hashcode ... that equal objects give equal hashcodes, which is the OP's "symmetry" requirement here ... and that different inputs usually give different outputs.)

These are:

   # 1
   return from ^ to;

   # 2
   return to*to+from*from;

   # 3
   int res = 17;
   res = res * 31 + Math.min(from, to);
   res = res * 31 + Math.max(from, to);
   return res;

The first one has the problem that the range of the output is bounded by the range of the actual input values. So for instance, if we assume that the inputs are non-negative numbers less than 2^i and 2^j respectively, then the output will be less than 2^max(i,j). That is likely to give you poor "dispersion" [1] in your hash table ... and a higher rate of collisions. (There is also a problem when from == to: the result is always 0!)
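
Both problems with #1 are easy to see in a small sketch. The names XorHashDemo, xorHash and maxXorHash are hypothetical helpers for this illustration only.

```java
class XorHashDemo {
    // Option #1 from the list above.
    static int xorHash(int from, int to) {
        return from ^ to;
    }

    // Largest hash produced by any pair with 0 <= a, b < limit.
    static int maxXorHash(int limit) {
        int max = 0;
        for (int a = 0; a < limit; a++) {
            for (int b = 0; b < limit; b++) {
                max = Math.max(max, xorHash(a, b));
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // Every self-connection hashes to 0.
        System.out.println(xorHash(5, 5)); // 0
        System.out.println(xorHash(9, 9)); // 0
        // With both inputs below 2^3 = 8, the hash never reaches 8,
        // so 36 distinct unordered pairs share only 8 possible values.
        System.out.println(maxXorHash(8)); // 7
    }
}
```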

The second and third ones are better than the first, but you are still liable to get more collisions than is desirable if from and to are small.
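
As a rough illustration, here is one small-value collision for each of #2 and #3. The class and method names are invented for this sketch; only the two formulas come from the list above.

```java
class SmallValueCollisions {
    // Option #2 from the list above.
    static int sumOfSquares(int from, int to) {
        return to * to + from * from;
    }

    // Option #3 from the list above.
    static int minMax31(int from, int to) {
        int res = 17;
        res = res * 31 + Math.min(from, to);
        res = res * 31 + Math.max(from, to);
        return res;
    }

    public static void main(String[] args) {
        // 0*0 + 5*5 == 3*3 + 4*4 == 25
        System.out.println(sumOfSquares(0, 5) == sumOfSquares(3, 4)); // true
        // 31*0 + 32 == 31*1 + 1, so the pairs (0,32) and (1,1) collide
        System.out.println(minMax31(0, 32) == minMax31(1, 1)); // true
    }
}
```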


I would suggest a 4th alternative if it is critical that you minimize collisions for small values of from and to.

  #4
  int res = Math.max(from, to);
  res = (res << 16) | (res >>> 16);  // exchange top and bottom 16 bits.
  res = res ^ Math.min(from, to);
  return res;

This has the advantage that if from and to are both in the range 0..2^16-1, you get a unique hashcode for each distinct (unordered) pair.
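
For completeness, here is a sketch of how #4 might slot into the OP's class, with a brute-force check of the uniqueness claim over a small range. The distinctHashes helper and the main method are additions for illustration, and making the fields private and final is my assumption rather than something the question requires.

```java
import java.util.HashSet;
import java.util.Set;

final class Connection {
    private final int from;
    private final int to;

    Connection(int from, int to) {
        this.from = from;
        this.to = to;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Connection)) {
            return false;
        }
        Connection other = (Connection) obj;
        // Equal in either orientation.
        return (from == other.from && to == other.to)
                || (from == other.to && to == other.from);
    }

    @Override
    public int hashCode() {
        int res = Math.max(from, to);
        res = (res << 16) | (res >>> 16); // exchange top and bottom 16 bits
        return res ^ Math.min(from, to);
    }

    // Counts distinct hashcodes over all unordered pairs with 0 <= a <= b < n.
    static int distinctHashes(int n) {
        Set<Integer> seen = new HashSet<>();
        for (int a = 0; a < n; a++) {
            for (int b = a; b < n; b++) {
                seen.add(new Connection(a, b).hashCode());
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        Set<Connection> set = new HashSet<>();
        set.add(new Connection(1, 2));
        System.out.println(set.contains(new Connection(2, 1))); // true

        int n = 300; // small range, well inside 0..2^16-1
        System.out.println(distinctHashes(n) == n * (n + 1) / 2); // true
    }
}
```

Outside that range collisions come back. For example, (65536, 1) hashes to 0, the same as (0, 0).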


1 - I don't know if this is the correct technical term for this ...

answered Oct 09 '22 11:10 by Stephen C