Why do some expressions that reference `x.y` change `id(x.y)`?

Tags: python, cpython

This question pertains to (at least) CPython 2.7.2 and 3.2.2.

Suppose we define Class and obj as follows.

class Class(object):

    def m(self):
        pass

    @property
    def p(self):
        return None

    @staticmethod
    def s():
        pass

obj = Class()

Short version

Why does the following code output False for each print()?

print(Class.__dict__ is Class.__dict__)
print(Class.__subclasshook__ is Class.__subclasshook__)
print(Class.m is Class.m)

print(obj.__delattr__ is obj.__delattr__)
print(obj.__format__ is obj.__format__)
print(obj.__getattribute__ is obj.__getattribute__)
print(obj.__hash__ is obj.__hash__)
print(obj.__init__ is obj.__init__)
print(obj.__reduce__ is obj.__reduce__)
print(obj.__reduce_ex__ is obj.__reduce_ex__)
print(obj.__repr__ is obj.__repr__)
print(obj.__setattr__ is obj.__setattr__)
print(obj.__sizeof__ is obj.__sizeof__)
print(obj.__str__ is obj.__str__)
print(obj.__subclasshook__ is obj.__subclasshook__)
print(obj.m is obj.m)

(That's for Python 2; for Python 3, omit the print() for Class.m and add similar print()s for obj.__eq__, obj.__ge__, obj.__gt__, obj.__le__, obj.__lt__, and obj.__ne__)

Why, on the other hand, does the following code output True for each print()?

print(Class.__class__ is Class.__class__)
print(Class.__delattr__ is Class.__delattr__)
print(Class.__doc__ is Class.__doc__)
print(Class.__format__ is Class.__format__)
print(Class.__getattribute__ is Class.__getattribute__)
print(Class.__hash__ is Class.__hash__)
print(Class.__init__ is Class.__init__)
print(Class.__module__ is Class.__module__)
print(Class.__new__ is Class.__new__)
print(Class.__reduce__ is Class.__reduce__)
print(Class.__reduce_ex__ is Class.__reduce_ex__)
print(Class.__repr__ is Class.__repr__)
print(Class.__setattr__ is Class.__setattr__)
print(Class.__sizeof__ is Class.__sizeof__)
print(Class.__str__ is Class.__str__)
print(Class.__weakref__ is Class.__weakref__)
print(Class.p is Class.p)
print(Class.s is Class.s)

print(obj.__class__ is obj.__class__)
print(obj.__dict__ is obj.__dict__)
print(obj.__doc__ is obj.__doc__)
print(obj.__module__ is obj.__module__)
print(obj.__new__ is obj.__new__)
print(obj.__weakref__ is obj.__weakref__)
print(obj.p is obj.p)
print(obj.s is obj.s)

(That's for Python 2; for Python 3, add similar print()s for Class.__eq__, Class.__ge__, Class.__gt__, Class.__le__, Class.__lt__, Class.__ne__, and Class.m)

Long version

If we ask for id(obj.m) twice in a row, we (unsurprisingly) get the same object ID twice.

>>> id(obj.m)
139675714789856
>>> id(obj.m)
139675714789856

However, if we ask for id(obj.m), then evaluate some expressions that reference obj.m, then ask for id(obj.m) again, we sometimes (but not always) find that the object ID has changed. In some of the cases where it changes, asking for id(obj.m) once more changes the ID back to the original value. In the cases where it doesn't change back, repeating the expressions between the id(obj.m) calls apparently makes the ID alternate between the two observed values.

Here are some examples where the object ID doesn't change:

>>> print(obj.m); id(obj.m)
<bound method Class.m of <__main__.Class object at 0x7f08c96058d0>>
139675714789856
>>> obj.m is None; id(obj.m)
False
139675714789856
>>> obj.m.__func__.__name__; id(obj.m)
'm'
139675714789856
>>> obj.m(); id(obj.m)
139675714789856

Here is an example where the object ID changes, then changes back:

>>> obj.m; id(obj.m); id(obj.m)
<bound method Class.m of <__main__.Class object at 0x7f08c96058d0>>
139675715407536
139675714789856

Here is an example where the object ID changes, then doesn't change back:

>>> obj.m is obj.m; id(obj.m); id(obj.m)
False
139675715407536
139675715407536

Here is the same example, with the triggering expression repeated a few times to demonstrate the alternating behavior:

>>> obj.m is obj.m; id(obj.m); id(obj.m)
False
139675714789856
139675714789856
>>> obj.m is obj.m; id(obj.m); id(obj.m)
False
139675715407536
139675715407536
>>> obj.m is obj.m; id(obj.m); id(obj.m)
False
139675714789856
139675714789856

Thus, the entire question consists of the following parts:

What kinds of attributes might change their identity as a side effect of expressions that do not modify those attributes?
What kinds of expressions trigger such changes?
What is the mechanism that causes such changes?
Under what conditions are the past identities recycled?
Why isn't the first identity recycled indefinitely, which would avoid all of this complication?
Is any of this documented?

asked Apr 18 '12 23:04

nisavid

1 Answer

> what kinds of attributes might change their identity as a side effect of expressions that do not modify those attributes?

Properties or, more precisely, objects that implement the descriptor protocol. For example, Class.__dict__ is not a dict but a dictproxy (called mappingproxy in Python 3.3 and later), and a new proxy object is created each time the attribute is requested. Why? Probably so that the proxy is not created until it is actually needed. However, this is an implementation detail; the important thing is that __dict__ works as documented.
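
As a small illustration of the point about Class.__dict__, here is a sketch that assumes CPython 3, where the proxy type is exposed as types.MappingProxyType (the Class definition is repeated only to keep the snippet self-contained). Both results are stored in variables so that the two objects are alive at the same time, which is the only situation in which comparing identities is meaningful.

```python
import types


class Class(object):
    def m(self):
        pass


# Keep both results alive at once; comparing temporaries would be misleading.
first = Class.__dict__
second = Class.__dict__

is_proxy = isinstance(first, types.MappingProxyType)
same_object = first is second  # False on CPython: a new proxy per access

# The proxy is a read-only view of the class namespace.
try:
    first["m"] = None
    writable = True
except TypeError:
    writable = False

print(is_proxy, same_object, writable)
```

That same_object is False is CPython behavior rather than a language guarantee, so treat it as an observation, not a contract.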

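Even ordinary instance methods are handled using descriptors: a function stored on a class has a __get__() method that creates a new bound method object on each attribute access, which explains why obj.m is not obj.m. Interestingly, if you do obj.m = obj.m you permanently store that bound method in the instance's __dict__, where it takes precedence over the function on the class, and then obj.m is obj.m. :-)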
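
The following sketch, written for Python 3, shows roughly what happens behind obj.m: the function is looked up on the class and its __get__() is called to build a bound method. The variable names (func, bound_a, bound_b) are only illustrative.

```python
class Class(object):
    def m(self):
        pass


obj = Class()

# The class namespace stores a plain function object.
func = Class.__dict__["m"]

bound_a = obj.m
bound_b = func.__get__(obj, Class)  # roughly what obj.m does internally

distinct = bound_a is not bound_b   # two separate bound-method objects
equal = bound_a == bound_b          # same __self__ and __func__, so equal
same_func = bound_a.__func__ is bound_b.__func__ is func

# Functions are non-data descriptors, so an entry in the instance's
# __dict__ takes precedence and no new bound method is created.
obj.m = obj.m
cached = obj.m is obj.m

print(distinct, equal, same_func, cached)
```

If you need to test whether two bound methods refer to the same method of the same object, compare them with == rather than is.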

> what kinds of expressions trigger such changes?

Any access to an attribute can trigger the __get__() method of a descriptor, and that method is free to return the same object every time or a different one on each access.
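
To make both possibilities concrete, here is a minimal sketch with two hypothetical descriptors, Fresh and Cached. Neither comes from the standard library; they exist only to show that the choice belongs to __get__().

```python
class Fresh(object):
    """Hypothetical descriptor that builds a new object on every access."""

    def __get__(self, instance, owner=None):
        return []


class Cached(object):
    """Hypothetical descriptor that always returns one stored object."""

    def __init__(self):
        self._value = []

    def __get__(self, instance, owner=None):
        return self._value


class Demo(object):
    fresh = Fresh()
    cached = Cached()


d = Demo()

# Both operands of "is" are alive during the comparison, so the result
# reflects real identity rather than address reuse.
fresh_is_same = d.fresh is d.fresh      # False: a new list each time
cached_is_same = d.cached is d.cached   # True: the same list each time

print(fresh_is_same, cached_is_same)
```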

> what is the mechanism that causes such changes?

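Properties/descriptors. When the attribute found on the class defines __get__(), attribute lookup calls it and returns whatever it produces; for a plain function, that is a newly created bound method.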

> under what conditions are the past identities recycled?

Not sure what you mean by "recycled." Do you mean "disposed of" or "reused"? In CPython, the id of an object is its memory address. If two objects occupy the same address at different times, they will have the same id. In id(obj.m), the bound method is a temporary that can be freed as soon as id() returns, so the next bound method may be allocated at the same address. Therefore, two references that have the same id at different times (even within a single statement) are not necessarily the same object. Other Python implementations use different rules for generating ids. For example, I believe Jython uses incrementing integers, which make object identity clearer.
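
A short sketch of the lifetime point, assuming Python 3: ids are only guaranteed to be distinct for objects that exist at the same time. The second comparison deliberately has no expected result, because whether the address is reused depends on the interpreter and its allocator.

```python
class Class(object):
    def m(self):
        pass


obj = Class()

# Both bound methods are kept alive by the names a and b,
# so their ids are guaranteed to differ.
a = obj.m
b = obj.m
ids_differ_while_alive = id(a) != id(b)

# Each bound method here is a temporary that may be freed as soon as id()
# returns, so the two numbers are allowed to coincide. On CPython they
# often do, but no particular outcome is guaranteed.
temp_ids_equal = id(obj.m) == id(obj.m)

print(ids_differ_while_alive, temp_ids_equal)
```

The practical consequence is that comparing id() values of temporaries tells you nothing about identity; hold references and use is instead.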

> why isn't the first identity recycled indefinitely, which would avoid all of this complication?

Presumably there was some advantage to using descriptors. The source code for the Python interpreter is available; look at that if you want to know more details.

> is any of this documented?

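Only partly. The descriptor protocol is described in the language reference, and the documentation for id() notes that two objects with non-overlapping lifetimes may have the same id() value. Which attributes produce a new object on each access, and when a memory address is reused, are implementation-specific details of the CPython interpreter and should not be relied upon. Other Python implementations (including future versions of CPython) may, and most likely will, behave differently. There are significant differences between 2.x and 3.x CPython, for example.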

answered Oct 29 '22 22:10

kindall
