
Why do some expressions that reference `x.y` change `id(x.y)`?

Tags: python, cpython

This question pertains to (at least) CPython 2.7.2 and 3.2.2.

Suppose we define Class and obj as follows.

class Class(object):

    def m(self):
        pass

    @property
    def p(self):
        return None

    @staticmethod
    def s():
        pass

obj = Class()

Short version

Why does the following code output False for each print()?

print(Class.__dict__ is Class.__dict__)
print(Class.__subclasshook__ is Class.__subclasshook__)
print(Class.m is Class.m)

print(obj.__delattr__ is obj.__delattr__)
print(obj.__format__ is obj.__format__)
print(obj.__getattribute__ is obj.__getattribute__)
print(obj.__hash__ is obj.__hash__)
print(obj.__init__ is obj.__init__)
print(obj.__reduce__ is obj.__reduce__)
print(obj.__reduce_ex__ is obj.__reduce_ex__)
print(obj.__repr__ is obj.__repr__)
print(obj.__setattr__ is obj.__setattr__)
print(obj.__sizeof__ is obj.__sizeof__)
print(obj.__str__ is obj.__str__)
print(obj.__subclasshook__ is obj.__subclasshook__)
print(obj.m is obj.m)

(That's for Python 2; for Python 3, omit the print() for Class.m and add similar print()s for obj.__eq__, obj.__ge__, obj.__gt__, obj.__le__, obj.__lt__, and obj.__ne__.)

Why, on the other hand, does the following code output True for each print()?

print(Class.__class__ is Class.__class__)
print(Class.__delattr__ is Class.__delattr__)
print(Class.__doc__ is Class.__doc__)
print(Class.__format__ is Class.__format__)
print(Class.__getattribute__ is Class.__getattribute__)
print(Class.__hash__ is Class.__hash__)
print(Class.__init__ is Class.__init__)
print(Class.__module__ is Class.__module__)
print(Class.__new__ is Class.__new__)
print(Class.__reduce__ is Class.__reduce__)
print(Class.__reduce_ex__ is Class.__reduce_ex__)
print(Class.__repr__ is Class.__repr__)
print(Class.__setattr__ is Class.__setattr__)
print(Class.__sizeof__ is Class.__sizeof__)
print(Class.__str__ is Class.__str__)
print(Class.__weakref__ is Class.__weakref__)
print(Class.p is Class.p)
print(Class.s is Class.s)

print(obj.__class__ is obj.__class__)
print(obj.__dict__ is obj.__dict__)
print(obj.__doc__ is obj.__doc__)
print(obj.__module__ is obj.__module__)
print(obj.__new__ is obj.__new__)
print(obj.__weakref__ is obj.__weakref__)
print(obj.p is obj.p)
print(obj.s is obj.s)

(That's for Python 2; for Python 3, add similar print()s for Class.__eq__, Class.__ge__, Class.__gt__, Class.__le__, Class.__lt__, Class.__ne__, and Class.m.)

Long version

If we ask for id(obj.m) twice in a row, we (unsurprisingly) get the same object ID twice.

>>> id(obj.m)
139675714789856
>>> id(obj.m)
139675714789856

However, if we ask for id(obj.m), then evaluate some expressions that reference obj.m, then ask for id(obj.m) again, we sometimes (but not always) find that the object ID has changed. Among the situations where it changes, in some of those, asking for id(obj.m) once more causes the ID to change back to the original value. In those cases where it doesn't change back, repeating the expressions between the id(obj.m) calls apparently causes the ID to alternate between the two observed values.

Here are some examples where the object ID doesn't change:

>>> print(obj.m); id(obj.m)
<bound method Class.m of <__main__.Class object at 0x7f08c96058d0>>
139675714789856
>>> obj.m is None; id(obj.m)
False
139675714789856
>>> obj.m.__func__.__name__; id(obj.m)
'm'
139675714789856
>>> obj.m(); id(obj.m)
139675714789856

Here is an example where the object ID changes, then changes back:

>>> obj.m; id(obj.m); id(obj.m)
<bound method Class.m of <__main__.Class object at 0x7f08c96058d0>>
139675715407536
139675714789856

Here is an example where the object ID changes, then doesn't change back:

>>> obj.m is obj.m; id(obj.m); id(obj.m)
False
139675715407536
139675715407536

Here is the same example, with the operant expression repeated a few times to demonstrate the alternating behavior:

>>> obj.m is obj.m; id(obj.m); id(obj.m)
False
139675714789856
139675714789856
>>> obj.m is obj.m; id(obj.m); id(obj.m)
False
139675715407536
139675715407536
>>> obj.m is obj.m; id(obj.m); id(obj.m)
False
139675714789856
139675714789856

Thus, the entire question consists of the following parts:

  • What kinds of attributes might change their identity as a side effect of expressions that do not modify those attributes?

  • What kinds of expressions trigger such changes?

  • What is the mechanism that causes such changes?

  • Under what conditions are the past identities recycled?

  • Why isn't the first identity recycled indefinitely, which would avoid all of this complication?

  • Is any of this documented?

asked Apr 18 '12 23:04 by nisavid



1 Answer

what kinds of attributes might change their identity as a side effect of expressions that do not modify those attributes?

Properties, or more precisely, objects that implement the descriptor protocol. For example, Class.__dict__ is not a dict but a dictproxy (called mappingproxy in Python 3.3 and later). This object is generated anew each time it is requested. Why? Probably to avoid the overhead of creating the object until it is actually needed. However, this is an implementation detail. The important thing is that __dict__ works as documented.
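As a rough illustration of the proxy behavior described above, the following sketch assumes CPython 3.3 or later, where the proxy type is exposed as types.MappingProxyType. The names first and second are only illustrative.

```python
import types


class Class(object):
    def m(self):
        pass


# Each access to Class.__dict__ builds a fresh read-only proxy object.
first = Class.__dict__
second = Class.__dict__

# The proxy is not a plain dict (assumes Python 3.3+ for this type name).
is_proxy = isinstance(first, types.MappingProxyType)

# Both proxies are alive at the same time, so they are distinct objects...
distinct_proxies = first is not second

# ...but they are views of the same underlying class namespace.
same_function = first["m"] is second["m"]

print(is_proxy, distinct_proxies, same_function)
```

Holding both proxies in variables keeps them alive together, which is why the identity comparison is meaningful here.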

Even ordinary instance methods are handled using descriptors, which explains why obj.m is not obj.m. Interestingly, if you do obj.m = obj.m you permanently store that method wrapper on the instance, and then obj.m is obj.m. :-)
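A minimal sketch of this, assuming Python 3: the function stored on the class is itself a descriptor, and each attribute access on the instance calls its __get__() to build a new bound method. The variable names are illustrative.

```python
class Class(object):
    def m(self):
        pass


obj = Class()

# The plain function lives in the class namespace.
func = Class.__dict__["m"]

# Roughly what obj.m does behind the scenes: call the descriptor's __get__().
bound_by_hand = func.__get__(obj, Class)

a = obj.m
b = obj.m
distinct_objects = a is not b          # two separate bound-method objects
equal_methods = a == b                 # equal: same __self__ and __func__
wraps_same_function = a.__func__ is func and a.__self__ is obj
matches_manual_binding = bound_by_hand == a

# Storing one bound method on the instance shadows the class attribute,
# because functions are non-data descriptors and the instance dict wins.
obj.m = obj.m
stable_after_storing = obj.m is obj.m

print(distinct_objects, equal_methods, stable_after_storing)
```

Note that storing the bound method on the instance also creates a reference cycle between obj and the method, so this is a demonstration rather than a recommended pattern.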

what kinds of expressions trigger such changes?

Any access to an attribute can trigger the __get__() method of a descriptor, and this method can always return the same object or return a different one each time.
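To make the two possibilities concrete, here is a small sketch with two hypothetical descriptors, Fresh and Cached (neither is part of Python or of the question's code). One returns a new object on every access, the other always returns the same one.

```python
class Fresh:
    """Hypothetical descriptor that builds a new object on every access."""

    def __get__(self, instance, owner=None):
        return object()


class Cached:
    """Hypothetical descriptor that always hands back the same object."""

    def __init__(self):
        self._value = object()

    def __get__(self, instance, owner=None):
        return self._value


class Demo:
    fresh = Fresh()
    cached = Cached()


d = Demo()

# Both temporaries exist during the comparison, so they cannot be identical.
fresh_is_stable = d.fresh is d.fresh

# The same stored object comes back each time.
cached_is_stable = d.cached is d.cached

print(fresh_is_stable, cached_is_stable)
```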

what is the mechanism that causes such changes?

Properties/descriptors.

under what conditions are the past identities recycled?

Not sure what you mean by "recycled." Do you mean "disposed of" or "reused"? In CPython, the id of an object is its memory location. If two objects end up at the same memory location at different times, they will have the same id. Therefore, two references that have the same id at different times (even within a single statement) are not necessarily the same object. In the interactive sessions above, the interpreter also binds the last displayed result to _, which can keep one bound method alive a little longer and push the next one to a different address; that is a likely source of the alternating values. Other Python implementations use different rules for generating ids. For example, I believe Jython uses incrementing integers, which provide more clarity into object identity.
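The difference between overlapping and non-overlapping lifetimes can be sketched as follows. Only the first result is guaranteed; whether the temporaries in the second part share an id depends on the allocator, so no particular output is assumed for them.

```python
class Class(object):
    def m(self):
        pass


obj = Class()

# While both bound methods are referenced, their lifetimes overlap,
# so their ids are guaranteed to differ.
a = obj.m
b = obj.m
ids_differ_while_alive = id(a) != id(b)

# Here each temporary bound method may be freed before the next one is
# created, so CPython is free (but not required) to reuse the same address.
ids_of_temporaries = [id(obj.m) for _ in range(3)]

print(ids_differ_while_alive)
print(ids_of_temporaries)
```

Comparing ids is only meaningful for objects that are alive at the same time; for that purpose the is operator on held references is the more reliable check.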

why isn't the first identity recycled indefinitely, which would avoid all of this complication?

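Presumably there was some advantage to using descriptors. One plausible reason is that building the bound method on demand avoids keeping a cached method object on every instance, which would cost memory and create a reference cycle between the instance and its methods. The source code for the Python interpreter is available; look at that if you want to know more details.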

is any of this documented?

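Partly. The descriptor protocol is documented in the data model chapter of the language reference, which also notes that a method object is created each time the attribute is retrieved from an instance. Which ids end up being reused, however, is an implementation-specific detail of the CPython interpreter and should not be relied upon. Other Python implementations (including future versions of CPython) may, and most likely will, behave differently. There are significant differences between 2.x and 3.x CPython, for example.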

answered Oct 29 '22 22:10 by kindall