
Python 2: different meaning of the 'in' keyword for sets and lists

Consider this snippet:

class SomeClass(object):

    def __init__(self, someattribute="somevalue"):
        self.someattribute = someattribute

    def __eq__(self, other):
        return self.someattribute == other.someattribute

    def __ne__(self, other):
        return not self.__eq__(other)

list_of_objects = [SomeClass()]
print(SomeClass() in list_of_objects)

set_of_objects = set([SomeClass()])
print(SomeClass() in set_of_objects)

which evaluates to:

True
False

Can anyone explain why the 'in' keyword has a different meaning for sets and lists? I would have expected both to return True, especially when the type being tested has equality methods defined.

mskel asked Feb 13 '12 04:02



2 Answers

The meaning is the same, but the implementation is different. A list simply examines each element in turn, checking for equality, so it works for your class. A set first hashes the object and only compares it against elements whose hash matches, so if your class doesn't implement __hash__ consistently with __eq__, the set appears not to work.

Your class defines __eq__ but doesn't define __hash__, so it won't work properly in sets or as a dictionary key. The rule for __eq__ and __hash__ is that two objects that compare equal under __eq__ must also have equal hashes. By default, objects hash based on their identity (in CPython, their memory address). So your two objects, which are equal by your definition, don't produce the same hash, and that breaks the rule about __eq__ and __hash__.
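To make the broken rule visible, here is a minimal sketch. The class name IdentityHashed is a hypothetical stand-in for SomeClass, and the explicit `__hash__ = object.__hash__` line is an assumption added so the snippet behaves the same on Python 3, where a class that defines __eq__ without __hash__ becomes unhashable instead of inheriting the identity-based hash.

```python
class IdentityHashed(object):
    """Hypothetical stand-in for the question's SomeClass."""

    def __init__(self, someattribute="somevalue"):
        self.someattribute = someattribute

    def __eq__(self, other):
        return self.someattribute == other.someattribute

    def __ne__(self, other):
        return not self.__eq__(other)

    # Python 2 inherits this identity-based hash implicitly. Python 3 would
    # set __hash__ to None here, so it is restored explicitly for the demo.
    __hash__ = object.__hash__


a = IdentityHashed()
b = IdentityHashed()

equal = (a == b)                  # equal by the class's own definition
same_hash = (hash(a) == hash(b))  # but the identity-based hashes differ
in_list = b in [a]                # list membership relies on equality
in_set = b in set([a])            # set membership looks up by hash first

print(equal, same_hash, in_list, in_set)
```

The two objects compare equal yet hash differently, which is exactly the combination that the list tolerates and the set does not.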

If you provide a __hash__ implementation, it will work fine. For your sample code, it could be:

def __hash__(self):
    return hash(self.someattribute)
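
Putting that method into the class from the question gives the following self-contained sketch. The variable names in_list and in_set are only there for illustration; everything else comes from the question.

```python
class SomeClass(object):

    def __init__(self, someattribute="somevalue"):
        self.someattribute = someattribute

    def __eq__(self, other):
        return self.someattribute == other.someattribute

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        # Hash the same data that __eq__ compares, so that equal objects
        # always end up with equal hashes.
        return hash(self.someattribute)


list_of_objects = [SomeClass()]
set_of_objects = set([SomeClass()])

in_list = SomeClass() in list_of_objects
in_set = SomeClass() in set_of_objects

print(in_list)
print(in_set)
```

Both checks now agree. Note that this only holds while someattribute is itself hashable and is not changed after the object has been added to a set or used as a dictionary key.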

Ned Batchelder answered Oct 20 '22 18:10


In pretty much any hashtable implementation, including Python's, if you override the equality method you must also override the hashing method (in Python, this is __hash__). The in operator for lists just checks equality against every element of the list, while the in operator for sets first hashes the object you are looking for, checks for an object in that slot of the hashtable, and then checks for equality if there is anything in the slot. So, if you override __eq__ without overriding __hash__, there is no guarantee that the in operator for sets will look in the right slot.
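
The lookup order described above can be sketched with a toy table. ToySet and Key are hypothetical names invented for this illustration, and the chained-bucket layout is a simplification, not how CPython's set is actually implemented. Key takes its hash as a constructor argument purely so the slot choice is easy to control.

```python
class ToySet(object):
    """Hypothetical chained hash table, for illustration only."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, item):
        # Step 1: the hash alone decides which slot is examined.
        return self.buckets[hash(item) % len(self.buckets)]

    def add(self, item):
        bucket = self._bucket(item)
        if not any(item == existing for existing in bucket):
            bucket.append(item)

    def __contains__(self, item):
        # Step 2: equality is only checked against items in that one slot.
        return any(item == existing for existing in self._bucket(item))


class Key(object):
    """Hypothetical key whose hash is chosen by the caller."""

    def __init__(self, value, forced_hash):
        self.value = value
        self.forced_hash = forced_hash

    def __eq__(self, other):
        return self.value == other.value

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        return self.forced_hash


toy = ToySet()
toy.add(Key("x", 1))

found_same_hash = Key("x", 1) in toy   # same slot, then equality matches
found_other_hash = Key("x", 2) in toy  # equal value, but a different slot

print(found_same_hash, found_other_hash)
```

The second lookup fails even though the keys compare equal, because the equality check never gets to see the stored element.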

Adam Mihalcin answered Oct 20 '22 18:10