
Why can't I change the __class__ attribute of an instance of object?

class A(object):
    pass

class B(A):
    pass

o = object()
a = A()
b = B()

While I can change a.__class__, I can't do the same with o.__class__ (it raises a TypeError). Why?

For example:

isinstance(a, A) # True
isinstance(a, B) # False
a.__class__ = B
isinstance(a, A) # True
isinstance(a, B) # True

isinstance(o, object) # True
isinstance(o, A) # False
o.__class__ = A # This fails and throws a TypeError
# isinstance(o, object)
# isinstance(o, A)

I know this generally isn’t a good idea, since it can lead to some very strange behaviour if it is handled incorrectly. It's just for the sake of curiosity.

Riccardo Bucco asked Dec 04 '19 11:12




2 Answers

CPython has a comment in Objects/typeobject.c on this topic:

In versions of CPython prior to 3.5, the code in compatible_for_assignment was not set up to correctly check for memory layout / slot / etc. compatibility for non-HEAPTYPE classes, so we just disallowed __class__ assignment in any case that wasn't HEAPTYPE -> HEAPTYPE.

During the 3.5 development cycle, we fixed the code in compatible_for_assignment to correctly check compatibility between arbitrary types, and started allowing __class__ assignment in all cases where the old and new types did in fact have compatible slots and memory layout (regardless of whether they were implemented as HEAPTYPEs or not).

Just before 3.5 was released, though, we discovered that this led to problems with immutable types like int, where the interpreter assumes they are immutable and interns some values. Formerly this wasn't a problem, because they really were immutable -- in particular, all the types where the interpreter applied this interning trick happened to also be statically allocated, so the old HEAPTYPE rules were "accidentally" stopping them from allowing __class__ assignment. But with the changes to __class__ assignment, we started allowing code like

class MyInt(int):
    ...
# Modifies the type of *all* instances of 1 in the whole program,
# including future instances (!), because the 1 object is interned.
(1).__class__ = MyInt

(see https://bugs.python.org/issue24912).

In theory the proper fix would be to identify which classes rely on this invariant and somehow disallow __class__ assignment only for them, perhaps via some mechanism like a new Py_TPFLAGS_IMMUTABLE flag (a "blacklisting" approach). But in practice, since this problem wasn't noticed until late in the 3.5 RC cycle, we're taking the conservative approach and reinstating the same HEAPTYPE->HEAPTYPE check that we used to have, plus a "whitelist". For now, the whitelist consists only of ModuleType subtypes, since those are the cases that motivated the patch in the first place -- see https://bugs.python.org/issue22986 -- and since module objects are mutable we can be sure that they are definitely not being interned. So now we allow HEAPTYPE->HEAPTYPE or ModuleType subtype -> ModuleType subtype.

So far as we know, all the code beyond the following 'if' statement will correctly handle non-HEAPTYPE classes, and the HEAPTYPE check is needed only to protect that subset of non-HEAPTYPE classes for which the interpreter has baked in the assumption that all instances are truly immutable.
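As a rough illustration of what the reinstated check means in practice, here is a minimal sketch, assuming CPython 3.5 or later. MyInt and VerboseModule are hypothetical names chosen for this example; they do not come from the CPython source.

```python
import types


class MyInt(int):
    pass


# int is a statically allocated type, so the assignment is rejected
# and the shared (interned) small-int object keeps its type.
try:
    (1).__class__ = MyInt
    int_assignment_allowed = True
except TypeError:
    int_assignment_allowed = False


class VerboseModule(types.ModuleType):
    pass


# ModuleType subtype -> ModuleType subtype is the whitelisted case.
mod = types.ModuleType("demo")
mod.__class__ = VerboseModule

print(int_assignment_allowed)  # False
print(type(mod).__name__)      # VerboseModule
```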

Explanation:

CPython allocates objects in one of two ways, on the heap or statically:

Objects are structures allocated on the heap. Special rules apply to the use of objects to ensure they are properly garbage-collected. Objects are never allocated statically or on the stack; they must be accessed through special macros and functions only. (Type objects are exceptions to the first rule; the standard types are represented by statically initialized type objects, although work on type/class unification for Python 2.2 made it possible to have heap-allocated type objects too).

This is quoted from the comment in Include/object.h.
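Whether a given type is a heap type can be inspected from Python. The sketch below assumes CPython, where type objects expose a __flags__ attribute and Py_TPFLAGS_HEAPTYPE is bit 9 in Include/object.h; is_heap_type is a hypothetical helper written for this example, and the whole thing relies on an implementation detail rather than a language guarantee.

```python
# Bit position of Py_TPFLAGS_HEAPTYPE in CPython's Include/object.h.
Py_TPFLAGS_HEAPTYPE = 1 << 9


def is_heap_type(cls):
    """Return True if cls was allocated dynamically (a heap type)."""
    return bool(cls.__flags__ & Py_TPFLAGS_HEAPTYPE)


class A:
    pass


print(is_heap_type(A))       # True: created by a class statement
print(is_heap_type(object))  # False: statically initialized
print(is_heap_type(int))     # False: statically initialized
```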

When you assign a new value to some_obj.__class__, the object_set_class function is called. Every type inherits it from PyBaseObject_Type (see the /* tp_getset */ field). The function checks whether the new type can replace the old type of some_obj.

Take your example:

class A:
    pass

class B:
    pass

o = object()
a = A() 
b = B() 

First case:

a.__class__ = B 

The type of a is A, which is a heap type because it is allocated dynamically, and so is B. The type of a is changed without a problem.

Second case:

o.__class__ = B

The type of o is the built-in type object (PyBaseObject_Type). It is not a heap type, so a TypeError is raised:

TypeError: __class__ assignment only supported for heap types or ModuleType subclasses.
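
Both cases can be reproduced side by side. This is a minimal sketch, assuming CPython 3; try_set_class is a hypothetical helper written for this example, and the exact wording of the error message may differ between Python versions.

```python
class A:
    pass


class B:
    pass


def try_set_class(obj, new_cls):
    """Return None on success, or the TypeError message on failure."""
    try:
        obj.__class__ = new_cls
    except TypeError as exc:
        return str(exc)
    return None


a = A()
o = object()

first = try_set_class(a, B)   # heap type -> heap type: allowed
second = try_set_class(o, B)  # built-in object -> heap type: rejected

print(first)           # None
print(type(a) is B)    # True
print(second)          # the TypeError message
print(type(o) is object)  # True, o is unchanged
```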

MiniMax answered Sep 21 '22 05:09


You can only change __class__ to another type that has the same internal (C) layout. The runtime generally knows that layout only if the type itself is dynamically allocated (a “heap type”), so, with the one exception explained in MiniMax’s answer, being a heap type is a necessary condition, which excludes the built-in types as source or destination. The two types must also have the same set of __slots__ with the same names.
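
The __slots__ requirement can be seen with a small sketch, assuming CPython 3. The classes P, Q and R are hypothetical and exist only for this example.

```python
class P:
    __slots__ = ("x",)


class Q:
    __slots__ = ("x",)  # same slot names as P: compatible layout


class R:
    __slots__ = ("y",)  # different slot name: incompatible layout


p = P()
p.x = 1

# Heap type -> heap type with identical slots is accepted,
# and the stored slot value survives the change.
p.__class__ = Q
print(type(p).__name__, p.x)  # Q 1

# Heap type -> heap type with different slots is rejected.
try:
    p.__class__ = R
    layout_mismatch_rejected = False
except TypeError:
    layout_mismatch_rejected = True

print(layout_mismatch_rejected)  # True
```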

Davis Herring answered Sep 21 '22 05:09