Suppose we have x = ['a', 'b']
.
What is happening under the hood for the statement:
x not in {None, False}
that raises the unhashable type: 'list'
error?
The workaround I found is to write this instead:
x != None and x!= False
but I am confused because, mathematically, both boolean expressions are equivalent.
Here's what the official documentation states:
[Python 3]: class set([iterable]):
Return a new set or frozenset object whose elements are taken from iterable. The elements of a set must be hashable.
[Python 3]: hashable:
An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.
...
All of Python’s immutable built-in objects are hashable; mutable containers (such as lists or dictionaries) are not.
[Python 3]: object.__contains__(self, item) (just above the anchor):
The membership test operators (in and not in) are normally implemented as an iteration through a sequence. However, container objects can supply the following special method with a more efficient implementation, which also does not require the object be a sequence.
Going into [GitHub]: python/cpython - (v3.5.4) cpython/Objects/setobject.c:
Line #1991:
static PyMethodDef set_methods[] = {
{"add", (PyCFunction)set_add, METH_O,
add_doc},
{"clear", (PyCFunction)set_clear, METH_NOARGS,
clear_doc},
{"__contains__",(PyCFunction)set_direct_contains, METH_O | METH_COEXIST, // @TODO - cfati: MARK THIS LINE
Line #1843:
static PyObject *
set_direct_contains(PySetObject *so, PyObject *key)
{
long result;
result = set_contains(so, key); // @TODO - cfati: MARK THIS LINE
if (result == -1)
return NULL;
return PyBool_FromLong(result);
}
Line #1823:
static int
set_contains(PySetObject *so, PyObject *key)
{
PyObject *tmpkey;
int rv;
rv = set_contains_key(so, key); // @TODO - cfati: MARK THIS LINE
if (rv == -1) {
if (!PySet_Check(key) || !PyErr_ExceptionMatches(PyExc_TypeError))
return -1;
PyErr_Clear();
tmpkey = make_new_set(&PyFrozenSet_Type, key);
if (tmpkey == NULL)
return -1;
rv = set_contains_key(so, tmpkey); // @TODO - cfati: MARK THIS LINE
Py_DECREF(tmpkey);
}
return rv;
}
Line #627:
static int
set_contains_key(PySetObject *so, PyObject *key)
{
setentry entry;
Py_hash_t hash;
if (!PyUnicode_CheckExact(key) ||
(hash = ((PyASCIIObject *) key)->hash) == -1) { // @TODO - cfati: MARK THIS LINE
hash = PyObject_Hash(key);
if (hash == -1)
return -1;
}
entry.key = key;
entry.hash = hash;
return set_contains_entry(so, &entry); // @TODO - cfati: MARK THIS LINE
}
Line #614:
static int
set_contains_entry(PySetObject *so, setentry *entry)
{
PyObject *key;
setentry *lu_entry;
lu_entry = set_lookkey(so, entry->key, entry->hash); // @TODO - cfati: MARK THIS LINE
if (lu_entry == NULL)
return -1;
key = lu_entry->key;
return key != NULL && key != dummy;
}
As seen from the "callstack" (presented in reversed order), in order to test for membership (in
/ not in
), hash is being performed (on all code paths) on the candidate member ("includee"), and since the list instance doesn't have the hash functionality, the interpreter spits out TypeError.
There is a number of ways to get around this (as many others already pointed out most of them):
x = ('a', 'b')
but (generally) these are just ways to get around the problem (this is my personal opinion), since if you end up comparing a list to None and False, the code (that yields that list) could use some refactoring.
if you can enter all elements you want to test in a set
, it means that all unhashable elements don't belong to your set (because you can't put them in)
You could do:
if x.__hash__ and x in {None,False}:
when object isn't hashable, x.__hash__
is None
(other alternatives: Asking "is hashable" about a Python value) and the second part isn't evaluated.
or (better ask forgiveness than permission):
def belongs(x):
try:
return x in {None,False}
except TypeError: # unhashable type
return False
both solutions are faster than using a list
or tuple
((None,False)
), because there's no linear search involved (that is if there are a lot of elements in the test list, not true for only 2 elements)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With