Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Making an object x such that "x in [x]" returns False

Tags:

If we make a pathological potato like this:

>>> class Potato: ...     def __eq__(self, other): ...         return False ...     def __hash__(self): ...         return random.randint(1, 10000) ...  >>> p = Potato() >>> p == p False 

We can break sets and dicts this way (note: it's the same even if __eq__ returns True, it's mucking with the hash that broke them):

>>> p in {p} False >>> p in {p: 0} False 

Also len({p: 0, p: 0}) == 2, and {p: 0}[p] raises KeyError, basically all mapping related stuff goes out the window, as expected.

But what I didn't expect is that we can't break lists

>>> p in [p] True 

Why is that? It seems that list.__contains__ iterates, but it's first checking identity before checking equality. Since it is not the case that identity implies equality (see for example NaN object), what is the reason for lists short-circuiting on identity comparisons?

like image 606
wim Avatar asked Apr 17 '15 06:04

wim


2 Answers

list, tuple, etc., does indeed do an identity check before an equality check, and this behavior is motivated by these invariants:

assert a in [a] assert a in (a,) assert [a].count(a) == 1 for a in container:     assert a in container    # this should ALWAYS be true 

Unfortunately, dicts, sets, and friends operate by hashes, so if you mess with those you can indeed effectively break them.

See this issue and this issue for some history.

like image 123
Ethan Furman Avatar answered Oct 11 '22 13:10

Ethan Furman


In general, breaking the assumption that identity implies equality can break a variety of things in Python. It is true that NaN breaks this assumption, and thus NaN breaks some things in Python. Discussion can be found in this Python bug. In a pre-release version of Python 3.0, reliance on this assumption was removed, but the resolution of the bug was to put it back in (i.e., make Python 3 give the same behavior as Python 2, in which the identity check shortcut is done). The documentation for Python 3 correctly says:

For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression x in y is equivalent to any(x is e or x == e for e in y).

However, it appears the documentation for Python 2 is incorrect, since it says:

For the list and tuple types, x in y is true if and only if there exists an index i such that x == y[i] is true.

You could raise a documentation bug about this if you want, although it is a pretty esoteric issue so I doubt it will be high on anyone's priority list.

like image 40
BrenBarn Avatar answered Oct 11 '22 13:10

BrenBarn