
Set subtraction in Python

In my Python code I have this class:

class _Point2D:
    def __init__(self, x, y):
        self.x = x
        self.y = y    

    def __repr__(self):
        return 'point: (' + str(self.x) + ', ' + str(self.y) + ')' 

And there are two lists, initialPointsList and burnedPointsList:

initialPointsList = []
initialPointsList.append(_Point2D(1, 1))
initialPointsList.append(_Point2D(1, 2))
initialPointsList.append(_Point2D(1, 3))
initialPointsList.append(_Point2D(1, 4))
initialPointsList.append(_Point2D(1, 5))
initialPointsList.append(_Point2D(1, 6))
initialPointsList.append(_Point2D(1, 7))

burnedPointsList = []
burnedPointsList.append(_Point2D(1, 2))
burnedPointsList.append(_Point2D(1, 3))

I want to calculate the difference between initialPointsList and burnedPointsList.

I ran:

result = set(initialPointsList) - set(burnedPointsList)
for item in result:
    print(item)

And I get the following output:

point: (1, 1)
point: (1, 4)
point: (1, 5)
point: (1, 6)
point: (1, 2)
point: (1, 3)
point: (1, 7)

But I expected a different result, without the burned points:

point: (1, 1)
point: (1, 4)
point: (1, 5)
point: (1, 6)
point: (1, 7)

What is the best way to do that in Python? What is wrong with my code?

Someone asked Oct 05 '15 13:10


2 Answers

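For this to work correctly, you need to define the __eq__() and __hash__() special methods. Without them, two _Point2D objects compare equal only if they are the same object, so set() treats the _Point2D(1, 2) in each list as different elements and the subtraction removes nothing. If you define __eq__(), it's usually also a good idea to define __ne__(): Python 2 does not derive it from __eq__(), although Python 3 does.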
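To see the default behaviour in isolation, here is a minimal sketch. PlainPoint is a hypothetical stand-in for the question's class (not a name from the original code), written with an explicit object base so it behaves the same on Python 2 and 3.

```python
class PlainPoint(object):
    """Hypothetical stand-in for _Point2D with no __eq__ or __hash__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y


a = PlainPoint(1, 2)
b = PlainPoint(1, 2)

# The coordinates match...
same_value = (a.x, a.y) == (b.x, b.y)   # True

# ...but the default == compares object identity, not contents.
default_equal = (a == b)                # False

# A set therefore keeps both objects as separate members.
distinct_members = len({a, b})          # 2

# Subtracting a set containing b does not remove a.
left_over = {a} - {b}                   # still contains a
```

This is the same thing that happens in the question: every burned point is a fresh object, so none of them matches anything in the initial set.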

__eq__() should return True if its arguments are equivalent (their x and y values are the same). __ne__() should do the opposite. It's usually also desirable for __eq__() to do type checking, and return False if the "other" value is not of the same type as self.

__hash__() should return a number. The number should be the same for two values which compare equal with __eq__(), and it's desirable but not strictly required for it to be different for distinct values. A good implementation is this:

def __hash__(self):
    return hash((self.x, self.y))

The tuple hashing algorithm will combine the hash values of its elements in a statistically well-behaved way. You may sometimes see people recommend bitwise XOR (i.e. self.x ^ self.y) here, but that isn't a good idea. That technique throws away all the bits they have in common, which makes for inferior hashing performance (e.g. it always returns zero if self.x == self.y).
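As a rough illustration of the XOR problem, the sketch below counts distinct hash values over a small 8x8 grid of coordinates. The grid size and variable names are arbitrary choices for the example, and the exact tuple hash values depend on the Python implementation, so only the comparison is asserted here.

```python
coords = range(8)
points = [(x, y) for x in coords for y in coords]   # 64 distinct points

# XOR of the coordinates: many different points share a value.
xor_hashes = set(x ^ y for x, y in points)

# Built-in tuple hashing of the same points.
tuple_hashes = set(hash(p) for p in points)

# Every point on the diagonal x == y collapses to zero under XOR.
diagonal_xor = set(x ^ x for x in coords)

# XOR is also symmetric, so (1, 2) and (2, 1) always collide.
symmetric_collision = (1 ^ 2) == (2 ^ 1)
```

On this grid XOR yields only 8 distinct values for 64 points, so a set or dict keyed on such objects would spend much more time resolving collisions.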

Finally, you need to make sure that hash values don't change after an object has been constructed. This is most easily accomplished by converting self.x and self.y into read-only properties.
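One way to sketch the read-only properties idea is shown below. FrozenPoint and the underscore-prefixed attribute names are assumptions made for this example, not part of the question's code.

```python
class FrozenPoint(object):
    """Hypothetical variant of _Point2D with read-only coordinates."""
    def __init__(self, x, y):
        # Store the values under private names; the public names are properties.
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y

    def __eq__(self, other):
        return type(self) is type(other) and (self.x, self.y) == (other.x, other.y)

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        return hash((self.x, self.y))


p = FrozenPoint(1, 2)
try:
    p.x = 99            # no setter is defined, so this raises AttributeError
    reassigned = True
except AttributeError:
    reassigned = False
```

This only guards against accidental assignment through the public names. Code that writes to p._x directly can still change the hash, so the protection is by convention rather than enforced.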

Kevin answered Sep 22 '22 18:09


For completeness, here are the __eq__, __ne__, and __hash__ methods mentioned in Kevin's answer.

def __eq__(self, other):
    return type(self) is type(other) and self.x == other.x and self.y == other.y

def __ne__(self, other):
    return not self.__eq__(other)

def __hash__(self):
    return hash((self.x, self.y))

I tested this by adding these methods to your class, and it produces the expected output:

point: (1, 5)
point: (1, 6)
point: (1, 1)
point: (1, 4)
point: (1, 7)

Caleb Mauer answered Sep 19 '22 18:09
