Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hash for lambda function in Python

Tags:

I'm trying to get the hash of a lambda function. Why do I get two values (8746164008739 and -9223363290690767077)? Why is the hash from the lambda function not always one value?

>>> fn = lambda: 1 >>> hash(fn) -9223363290690767077 >>> fn = lambda: 1 >>> hash(fn) 8746164008739 >>> fn = lambda: 1 >>> hash(fn) -9223363290690767077 >>> fn = lambda: 1 >>> hash(fn) 8746164008739 >>> fn = lambda: 1 >>> hash(fn) -9223363290690767077 
like image 463
Bogdan Ruzhitskiy Avatar asked Nov 30 '15 12:11

Bogdan Ruzhitskiy


People also ask

What is hash () function in Python?

Python hash() method Python hash() function is a built-in function and returns the hash value of an object if it has one. The hash value is an integer which is used to quickly compare dictionary keys while looking at a dictionary.

What is __ hash __ Python?

The hash() function accepts an object and returns the hash value as an integer. When you pass an object to the hash() function, Python will execute the __hash__ special method of the object. It means that when you pass the p1 object to the hash() function: hash(p1) Code language: Python (python)

Are lambda functions hashable?

Hence, lambda functions are hashable.


2 Answers

Two objects are not guaranteed to hash to the same value unless they compare equal [1].

Python functions (including lambdas) don't compare equal even if they have identical code [2]. For example:

>>> (lambda: 1) == (lambda: 1) False 

Implementation-wise, this behaviour is due to the fact that function objects don't provide their own equality operator. Instead, they inherit the default one that uses the object's identity, i.e. its address. From the documentation:

If no __cmp__(), __eq__() or __ne__() operation is defined, class instances are compared by object identity (“address”).

Here is what happens in your particular example:

fn = lambda: 1  # New function is allocated at address A and stored in fn. fn = lambda: 1  # New function is allocated at address B and stored in fn.                 # The function at address A is garbage collected. fn = lambda: 1  # New function is allocated at address A and stored in fn.                 # The function at address B is garbage collected. fn = lambda: 1  # New function is allocated at address B and stored in fn.                 # The function at address A is garbage collected. ... 

Since address A is always hashed to one value, and address B to another, you are seeing hash(fn) alternate between the two values. This alternating behaviour is, however, an implementation artefact and could change one day if, for example, the garbage collector were made to behave slightly differently.

The following insightful note has been contributed by @ruakh:

It is worth noting that it's not possible to write a general process for determining if two functions are equivalent. (This is a consequence of the undecidability of the halting problem.) Furthermore, two Python functions can behave differently even if their code is identical (since they may be closures referring to distinct-but-identically-named variables). So it makes sense that Python functions don't overload the equality operator: there's no way to implement anything better than the default object-identity comparison.

[1] The converse is generally not true: two objects that compare unequal can have the same hash value. This is called a hash collision.

[2] Calling your lambdas and then hashing the result would of course always give the same value since hash(1) is always the same within one program:

>>> (lambda: 1)() == (lambda: 1)() True 
like image 199
NPE Avatar answered Sep 29 '22 16:09

NPE


The hash of a lambda function object is based on its memory address (in CPython this is what the id function returns). This means that any two function objects will have different hashes (assuming there are no hash collisions), even if the functions contain the same code.

To explain what's happening in the question, first note that writing fn = lambda: 1 creates a new function object in memory and binds the name fn to it. This new function will therefore have a different hash value to any existing functions.

Repeating fn = lambda: 1, you get alternating values for the hashes because when fn is bound to the newly created function object, the function that fn previously pointed to is garbage collected by Python. This is because there are no longer any references to it (since the name fn now points to a different object).

The Python interpreter then reuses this old memory address for the next new function object created by writing fn = lambda: 1.

This behaviour might vary between different systems and Python implementations.

like image 20
Alex Riley Avatar answered Sep 29 '22 17:09

Alex Riley