Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Two instances of class are equal but different hash code

Tags:

python

I'm working on a geometry in space project and I have different geometrical entities, among which Point. Sometimes two points are equal but for small numerical errors due to calculation, such as 1 and 1.0000000001, so I implemented the __eq__ method with math.isclose() function to sort this things out.

class Point(object):

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __eq__(self, other):
        if isinstance(other, Point):
            equal_x = math.isclose(other.x, self.x, rel_tol=5e-3, abs_tol=1e-5)
            equal_y = math.isclose(other.y, self.y, rel_tol=5e-3, abs_tol=1e-5)
            equal_z = math.isclose(other.z, self.z, rel_tol=5e-3, abs_tol=1e-5)
            if equal_x and equal_y and equal_z:
                return True

        return False   

How to implement the __hash__ method in order for such two objects to be equal when using sets and dictionaries?

The ultimate goal is to use the following function to "uniquify" a list of such object and remove duplicates:

def f12(seq):
    # from Raymond Hettinger
    # https://twitter.com/raymondh/status/944125570534621185
    return list(dict.fromkeys(seq))
like image 608
Enrico Agostini Avatar asked Mar 10 '20 16:03

Enrico Agostini


People also ask

Can two classes have same hash code?

1) If two objects are equal (i.e. the equals() method returns true), they must have the same hashcode. 2) If the hashCode() method is called multiple times on the same object, it must return the same result every time. 3) Two different objects can have the same hash code.

Can two objects which are not equal have the same hash code?

Unequal objects must have different hash codes – WRONG! Objects with the same hash code must be equal – WRONG!

How you write Hashcode What if it returns different Hashcode for same object?

Whenever it(hashcode) is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified.


1 Answers

There is a general problem with your equal method.

Most people will expect transitivity in a equal method. This means that when two points a and b are equal and a is also equal to an other point c, then a and c should be also equal. That doesn't have to be the case in your implementation.

Imagine each point, have a sphere around him, where the points are considered equal. The following drawing shows this spheres (or better half the radius of the sphere) and overlapping therefore implies that the points are equal:

point visualization

So a and b should have the same hash code and b and c should have the same hash code, but not a and c? How should this be possible?

I would propose to add an extra method is_close_to and implement the logic there.

Edit: @JLPeyret points out, that you can use a grid and compute the hash value of a point corresponding to the quadrant of the grid that contains the point. In this case it is possible, that two near points are close to the division of a grid quadrant and therefore assigned with different hash values. If such a probabilistic approach works for you, take a look at locality-sensitive hashing.

like image 171
Querenker Avatar answered Oct 21 '22 23:10

Querenker