Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Class with changing __hash__ still works with dictionary access

What I did is obviously not something that one would want to do, rather, I was just testing out implementing __hash__ for a given class.

I wanted to see if adding a phony 'hashable' class to a dictionary, then changing it's hash value would then result in it not being able to access it.

My class looks like this:

class PhonyHash:

    def __hash__(self):
        val = list("A string")
        return id(val)  # always different

Executing the following in my IPython console:

>>> p = PhonyHash()
>>> d = { p: "a value"}
>>> hash(p)  # changes hash

and then trying to access the element with d[p] works:

>>> d[p]
"a value"

I understand, this is not something that should be done, I'm really just curious as to why it works. Doesn't dict use the hash() of an object to store/retrieve it? Why is this working?

edit: as noted in the comments by @VPfB sets behave as expected, for some reason:

>>> p = PhonyHash()
>>> s = {p}
>>> p in s
False
like image 848
root Avatar asked Jul 14 '16 12:07

root


1 Answers

This is a strange bit of fate. Several bits of CPython machinery have thwarted you. The three issues at play are:

  1. The initial size of the array that backs a dict is 8 [1]
  2. All objects in CPython have memory addresses that are modulo 8 [2]
  3. The dict class has an optimisation that checks keys are the same object and stops there if true (otherwise it will check if they are equal according to the __eq__ method) [3]

What this means is that despite your object always producing different hash values, the first slot of the backing array that will be examined is the same. If it wasn't you'd get a key error because the slot was empty. The dict then decides it has the right key because it has the exact same object, not just an equal object.

class PhonyHash:
    _hash = 1
    def __hash__(self):
        return self._hash

p = PhonyHash()
d = {p: "val"}
print(p in d) # True
p._hash = 2
print(p in d) # False
p._hash = 9 # 9 % 8 == 1
print(p in d) # True

Links to CPython sources

  1. The dict struct defines ma_table, which starts as ma_smalltable, which is of length PyDict_MinSize.
  2. This is documented in Objects/obmalloc.c
  3. Can be seen in the lookup function here and here
like image 151
Dunes Avatar answered Oct 15 '22 14:10

Dunes