
Hash for lambda function in Python

I'm trying to get the hash of a lambda function. Why do I get two values (8746164008739 and -9223363290690767077)? Why is the hash from the lambda function not always one value?

>>> fn = lambda: 1
>>> hash(fn)
-9223363290690767077
>>> fn = lambda: 1
>>> hash(fn)
8746164008739
>>> fn = lambda: 1
>>> hash(fn)
-9223363290690767077
>>> fn = lambda: 1
>>> hash(fn)
8746164008739
>>> fn = lambda: 1
>>> hash(fn)
-9223363290690767077

asked Nov 30 '15 12:11

Bogdan Ruzhitskiy

2 Answers

Two objects are not guaranteed to hash to the same value unless they compare equal [1].

Python functions (including lambdas) don't compare equal even if they have identical code [2]. For example:

>>> (lambda: 1) == (lambda: 1)
False

Implementation-wise, this behaviour is due to the fact that function objects don't provide their own equality operator. Instead, they inherit the default one that uses the object's identity, i.e. its address. From the documentation:

If no __cmp__(), __eq__() or __ne__() operation is defined, class instances are compared by object identity (“address”).
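A minimal sketch of that default, identity-based behaviour. The names `f`, `g` and `registry` are only illustrative and do not come from the question:

```python
# Two lambdas with identical source are still two distinct objects.
f = lambda: 1
g = lambda: 1

same_object_equal = (f == f)        # an object always equals itself
different_objects_equal = (f == g)  # distinct function objects never compare equal

# Because equality is identity-based, a function works as a dict key,
# and a lookup only succeeds with that very same object.
registry = {f: "first"}
found_f = f in registry
found_g = g in registry

print(same_object_equal, different_objects_equal, found_f, found_g)
# True False True False
```

Note that the hash of one particular function object stays the same for as long as that object exists; only a new object can produce a new value.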

Here is what happens in your particular example:

fn = lambda: 1  # New function is allocated at address A and stored in fn.
fn = lambda: 1  # New function is allocated at address B and stored in fn.
                # The function at address A is garbage collected.
fn = lambda: 1  # New function is allocated at address A and stored in fn.
                # The function at address B is garbage collected.
fn = lambda: 1  # New function is allocated at address B and stored in fn.
                # The function at address A is garbage collected.
...

Since address A is always hashed to one value, and address B to another, you are seeing hash(fn) alternate between the two values. This alternating behaviour is, however, an implementation artefact and could change one day if, for example, the garbage collector were made to behave slightly differently.
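One way to check this explanation is to stop the old functions from being freed. The sketch below keeps a reference to every lambda in a list (the name `kept_alive` is hypothetical), so no address can be reused and the functions can no longer alternate between two identities:

```python
# Keep a reference to each lambda so that none of them can be freed.
kept_alive = [lambda: 1 for _ in range(5)]

# Objects that are alive at the same time always have distinct identities.
distinct_ids = len({id(fn) for fn in kept_alive})

# A set relies on hash() plus ==, and all five functions stay separate members.
distinct_members = len(set(kept_alive))

print(distinct_ids, distinct_members)  # 5 5
```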

The following insightful note has been contributed by @ruakh:

It is worth noting that it's not possible to write a general process for determining if two functions are equivalent. (This is a consequence of the undecidability of the halting problem.) Furthermore, two Python functions can behave differently even if their code is identical (since they may be closures referring to distinct-but-identically-named variables). So it makes sense that Python functions don't overload the equality operator: there's no way to implement anything better than the default object-identity comparison.
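The closure point can be illustrated with a short sketch. The helper `make_adder` is a made-up example, not something from the question:

```python
def make_adder(n):
    # Each call returns a new closure that captures its own `n`.
    return lambda x: x + n

add_one = make_adder(1)
add_two = make_adder(2)

# Both closures are built from equal code objects...
same_code = add_one.__code__ == add_two.__code__
# ...yet they behave differently because they captured different values.
results = (add_one(10), add_two(10))

print(same_code, results, add_one == add_two)  # True (11, 12) False
```

Comparing the code objects alone would therefore wrongly treat `add_one` and `add_two` as the same function.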

[1] The converse is generally not true: two objects that compare unequal can have the same hash value. This is called a hash collision.
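As a small illustration of footnote [1], here is a sketch with a hypothetical class `Key` whose hash is deliberately coarse, so that unequal objects are guaranteed to collide:

```python
class Key:
    """Hypothetical class used only to force a hash collision."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Key) and self.name == other.name

    def __hash__(self):
        # Every instance hashes to the same value, so collisions are guaranteed.
        return 42

a, b = Key("a"), Key("b")
print(hash(a) == hash(b), a == b)  # True False

# Dictionaries still work: equality is used to resolve the collision.
table = {a: 1, b: 2}
print(table[a], table[b])  # 1 2
```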

[2] Calling your lambdas and then hashing the result would of course always give the same value since hash(1) is always the same within one program:

>>> (lambda: 1)() == (lambda: 1)()
True

answered Sep 29 '22 16:09

NPE

The hash of a lambda function object is based on its memory address (in CPython, the address is what the id function returns). This means that any two function objects that exist at the same time will have different hashes (assuming there are no hash collisions), even if the functions contain the same code.

To explain what's happening in the question, first note that writing fn = lambda: 1 creates a new function object in memory and binds the name fn to it. This new function will therefore have a different hash value to any existing functions.

Repeating fn = lambda: 1, you get alternating values for the hashes because when fn is bound to the newly created function object, the function that fn previously pointed to is garbage collected by Python. This is because there are no longer any references to it (since the name fn now points to a different object).

The Python interpreter then reuses this old memory address for the next new function object created by writing fn = lambda: 1.

This behaviour might vary between different systems and Python implementations.
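The "no longer any references" step can be observed with a weak reference. This is only a sketch and assumes CPython, which frees an object as soon as its reference count reaches zero; other implementations may free the old function later. The function name `rebinding_frees_old_function` is hypothetical:

```python
import weakref

def rebinding_frees_old_function():
    fn = lambda: 1
    old = weakref.ref(fn)   # a weak reference does not keep the function alive
    fn = lambda: 1          # rebinding drops the last strong reference
    return old() is None    # True once the first function has been freed

print(rebinding_frees_old_function())
```

On CPython this prints True, which is why the freed address is immediately available for the next function object.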

answered Sep 29 '22 17:09

Alex Riley
