This is an attempt to better understand how reference counting works in Python.
Let's create a class and instantiate it. The instance's reference count should be 1 (getrefcount displays 2 because the call itself holds a temporary reference to its argument, which raises the count by 1):
>>> from sys import getrefcount as grc
>>> class A():
...     def __init__(self):
...         self.x = 100000
...
>>> a = A()
>>> grc(a)
2
a's internal variable x has 2 references:
>>> grc(a.x)
3
I expected it to be referenced by a and by A's __init__ method. Then I decided to check.
So I created a temporary variable b in the __main__ namespace just to be able to access the variable x. That increased the number of references by 1, to 3 (as expected):
>>> b = a.x
>>> grc(a.x)
4
Then I deleted the class instance and the ref count decreased by 1:
>>> del a
>>> grc(b)
3
So now there are 2 references: one by b and one by A (as I expected).
By deleting A from the __main__ namespace I expect the count to decrease by 1 again.
>>> del A
>>> grc(b)
3
But that doesn't happen. There is no class A or any of its instances left to reference 100000, but it is still referenced by something other than b in the __main__ namespace.
So, my question is: what references 100000 apart from b?
BrenBarn suggested that I should use object() instead of a number, which may be stored somewhere internally.
>>> class A():
...     def __init__(self):
...         self.x = object()
...
>>> a = A()
>>> b = a.x
>>> grc(a.x)
3
>>> del a
>>> grc(b)
2
After deleting the instance a there was only one reference, by b, which is very logical.
The only thing left to understand is why it's not that way with the number 100000.
a.x is the integer 100000. This constant is referenced by the code object corresponding to the __init__() method of A. Code objects always include references to all literal constants in the code:
>>> def f(): return 10000
>>> f.__code__.co_consts
(None, 10000)
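The same check can be made on the class from the question. The following is only a minimal sketch, assuming CPython; the names consts and held_by_code are introduced here for illustration. It tests whether a.x is the very object stored in the constants tuple of __init__, which would account for the extra reference.

```python
class A:
    def __init__(self):
        self.x = 100000

a = A()

# Constants tuple of the code object behind A.__init__.
# Its exact contents and order can differ between Python versions.
consts = A.__init__.__code__.co_consts

# Identity (not equality) check: is a.x the same object that the
# code object keeps alive? This relies on CPython behaviour.
held_by_code = any(c is a.x for c in consts)

print(consts, held_by_code)
```

If held_by_code is true, the integer stays alive for as long as the code object does, whether or not any instance of A still exists.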
The line del A only deletes the name A and decreases the reference count of the class object A. In Python 3.x (but not in 2.x), classes often include some cyclic references, and hence are only reclaimed when the cyclic garbage collector runs, not as soon as their last name is deleted. And indeed, using
import gc
gc.collect()
after del A does lead to a reduction of the reference count of b.
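The effect can be reproduced outside the interactive prompt with a sketch like the one below, assuming CPython. The class is compiled from a string with exec() so that, as with a statement typed at the REPL, no enclosing module code object keeps the constant alive. The names SOURCE, ns, before and after are hypothetical, and automatic collection is disabled only to make the experiment deterministic.

```python
import gc
import sys

# Compiled separately via exec(), mimicking a statement typed at the REPL.
SOURCE = """
class A:
    def __init__(self):
        self.x = 100000
"""

gc.disable()            # keep automatic collection out of the experiment
ns = {}
exec(SOURCE, ns)

a = ns["A"]()
b = a.x
del a                   # the instance is freed immediately (no cycle)
del ns["A"]             # removes the name; the class sits in a reference cycle

before = sys.getrefcount(b)
gc.collect()            # frees the class, its __init__ and the code object
after = sys.getrefcount(b)
gc.enable()

print(before, after)
```

The absolute numbers depend on the interpreter version, so only the comparison between before and after is meaningful here.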
It's likely that this is an artifact of your using an integer as your test value. Python sometimes stores integer objects for later re-use, because they are immutable. When I run your code using self.x = object() instead (which will always create a brand-new object for x) I do get grc(b)==2 at the end.
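For comparison, here is a rough sketch of the object() variant, again assuming CPython; with_instance and without_instance are names made up for this example. Because object() is evaluated at call time rather than stored as a literal constant, the only holders of the new object are the instance and b.

```python
import sys

class A:
    def __init__(self):
        # A fresh object is created on every call; it is not a code constant.
        self.x = object()

a = A()
b = a.x

with_instance = sys.getrefcount(b)     # referenced by a.x and by b
del a                                  # the instance and its attributes are freed
without_instance = sys.getrefcount(b)  # referenced by b only

print(with_instance, without_instance)
```

Here the count drops as soon as the instance is deleted, with no garbage collector run needed.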