I've got a highly complex class:
class C:
    pass
And I've got this test code:
for j in range(10):
    c = C()
    print c
Which gives:
<__main__.C instance at 0x7f7336a6cb00>
<__main__.C instance at 0x7f7336a6cab8>
<__main__.C instance at 0x7f7336a6cb00>
<__main__.C instance at 0x7f7336a6cab8>
<__main__.C instance at 0x7f7336a6cb00>
<__main__.C instance at 0x7f7336a6cab8>
<__main__.C instance at 0x7f7336a6cb00>
<__main__.C instance at 0x7f7336a6cab8>
<__main__.C instance at 0x7f7336a6cb00>
<__main__.C instance at 0x7f7336a6cab8>
One can easily see that Python alternates between two different memory addresses. In some cases, this could be catastrophic (for example, if we store the objects in some other complex object).
Now, if I store the objects in a list:
lst = []
for j in range(10):
    c = C()
    lst.append(c)
    print c
I get this:
<__main__.C instance at 0x7fd8f8f7eb00>
<__main__.C instance at 0x7fd8f8f7eab8>
<__main__.C instance at 0x7fd8f8f7eb48>
<__main__.C instance at 0x7fd8f8f7eb90>
<__main__.C instance at 0x7fd8f8f7ebd8>
<__main__.C instance at 0x7fd8f8f7ec20>
<__main__.C instance at 0x7fd8f8f7ec68>
<__main__.C instance at 0x7fd8f8f7ecb0>
<__main__.C instance at 0x7fd8f8f7ecf8>
<__main__.C instance at 0x7fd8f8f7ed40>
Which solves the problem.
So now, I have to ask a question... Could anyone explain in complex words (I mean, in depth) how Python behaves with object references? I suppose it is a matter of optimization (to save memory, prevent leaks, ...).
Thanks a lot.
EDIT: OK, let's be more specific. I'm quite aware that Python has to collect garbage sometimes... But in my case:
I had a list returned by a Cython-defined class: a 'Network' class that manages a list of 'Node's (both the Network and Node classes are defined in a Cython extension). Each Node holds a 'userdata' Python object [then cast into a (void *)]. The Node list is populated from inside Cython, while the UserData objects are populated inside the Python script. So in Python, I had the following:
...
def some_python_class_method(self):
    nodes = self.netBinding.GetNetwork().get_nodes()
    ...
    for item in it:
        a_site = PyLabSiteEvent()
        #l_site.append(a_site)  # WARN: required to keep a reference to 'a_site'
                                # that persists - workaround...
        item.SetUserData(a_site)
Reusing this node list later on in the same Python class, using the same Cython getter:
def some_other_python_class_method(self, node):
    s_data = node.GetUserData()
    ...
So it seems that, with the objects stored only as the nodes' UserData, my Python script was completely blind to them and kept freeing/reusing their memory. It worked once I referenced them a second time (but apparently a first time from Python's point of view), using an additional list (here: 'l_site').
This is why I had to learn a bit more about Python itself, but it seems that the way I implemented the communication between Python and Cython is responsible for the issues I had to face.
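For reference, here is a minimal sketch of that workaround. The binding names (netBinding, GetNetwork, get_nodes, PyLabSiteEvent, SetUserData, GetUserData) are taken from the snippets above, while the wrapper class and the self._sites attribute are hypothetical; the idea is simply to keep a Python-side reference to every object handed over as userdata, because the (void *) stored on the Cython side does not count as a reference:
class PySiteManager(object):  # hypothetical class, for illustration only
    def __init__(self, net_binding):
        self.netBinding = net_binding
        self._sites = []                 # Python-side strong references keep
                                         # the userdata objects alive

    def attach_sites(self):
        nodes = self.netBinding.GetNetwork().get_nodes()
        for node in nodes:
            a_site = PyLabSiteEvent()    # Python-level userdata object
            self._sites.append(a_site)   # reference owned by Python
            node.SetUserData(a_site)     # Cython keeps only a (void *), which
                                         # the reference counter knows nothing about

    def read_back(self, node):
        return node.GetUserData()        # safe: the object is still alive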
There is no need to be "complex" here: in the first example, you keep no other reference to the object referenced by the name "c" - when the line "c = C()" runs on subsequent iterations of the loop, the one reference previously held in "c" is lost.
Since standard Python uses reference counting to keep track of when it should delete objects from memory, at that moment the reference count of the object from the previous loop iteration reaches 0, so it is destroyed and its memory is made available for other objects.
Why do you see two alternating values? Because at the moment the object of the new iteration is created - i.e. when Python evaluates the expression on the right side of the = in c = C() - the object of the previous iteration still exists, referenced by the name c, so the new object is constructed at another memory location. Python then proceeds to assign the new object to c, at which point the previous object is destroyed as described above - which means that on the next (third) iteration, that memory will be available for a new instance of C.
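A small sketch that makes this ordering visible, splitting c = C() into two steps and using a __del__ hook plus hex(id(...)) purely for illustration (the exact addresses and their reuse are CPython implementation details):
class C(object):
    def __del__(self):
        # runs as soon as the last reference disappears (CPython refcounting)
        print('destroying %s' % hex(id(self)))

for j in range(3):
    new_obj = C()                  # built while the previous 'c' still exists,
                                   # so it cannot reuse that address yet
    print('created    %s' % hex(id(new_obj)))
    c = new_obj                    # rebinding 'c' drops the old object's
                                   # reference count to zero: destroyed here
On CPython, the third 'created' line typically shows the same address as the first one, mirroring the two alternating addresses in the question.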
In the second example, the newly created objects never lose their references, and therefore their memory is not freed at all - new objects always take up a new memory location.
Most important of all: the purpose of using a high-level language such as Python is not having to worry about memory allocation. The language takes care of that for you. In this case, the CPython (standard) implementation does just the right thing, as you noticed. Other implementations such as PyPy or Jython can have completely different behavior with regard to the "memory location" of each instance in the above examples, but all conforming implementations (including these three) will behave exactly the same from the "point of view" of the Python program: (1) it does have access to the instances it keeps a reference to, and (2) the data of these instances is not corrupted or mangled in any way.
It doesn't seem complicated.
In the first example, the second time through the loop, the memory at 0x7f7336a6cb00 is occupied by the C instance created during the first iteration. Therefore, Python allocates the next memory block 0x7f7336a6cab8 for the new C object.
However, as soon as you create the second C object and assign it to c, there are no references left to the now-orphaned object at 0x7f7336a6cb00. Therefore, the third time through the loop Python can re-use the memory at this location for the new object. As soon as it does so, of course, the object at 0x7f7336a6cab8 no longer has a reference to it, and that memory location becomes available for recycling the fourth time through the loop.
In your second example, however, by appending the object to the list you are preserving a reference to each object you create. Since these objects always have at least one reference to them, the memory they "live in" is never available to be freed and recycled. Therefore Python allocates new memory each time.
The illusion of danger produced in the first example is just that — an illusion. As long as a reference exists to an object you create, the object will be maintained. When no references exist any more it is safe for Python to free the memory used by the object since it is not possible for your program to make use of the object any more.
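One quick way to see this contrast is to collect the id() values from both loops; the exact counts are a CPython implementation detail, but the pattern matches the outputs above:
class C(object):
    pass

# No extra references: the address gets recycled, so only a couple of
# distinct ids show up (typically two, as in the question's output).
recycled_ids = set()
for j in range(10):
    c = C()
    recycled_ids.add(id(c))

# Every object kept alive in a list: each instance needs its own memory,
# so all ten ids are distinct.
kept = []
kept_ids = set()
for j in range(10):
    c = C()
    kept.append(c)
    kept_ids.add(id(c))

print('%d %d' % (len(recycled_ids), len(kept_ids)))   # e.g. "2 10" on CPython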
During each loop iteration, you change the object c refers to, so the original becomes inaccessible, and Python is free to get rid of it (why keep an object around if you can never access it again?). At that point, the interpreter has free memory at that same spot and appears to reuse it. I'm not sure what's surprising about this. If the interpreter never reused memory, you'd run out really fast.
This isn't happening when you add the object to a list because the object is still accessible, so Python can't get rid of it (since you might use it again).
This shouldn't ever cause problems, since Python won't get rid of an object while you can still use it, so if you "store the objects in some complex object", they will remain accessible and their memory won't be reused (at least until the containing object goes away).
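A short sketch with the standard weakref module shows exactly when CPython gets rid of an instance; a weak reference observes an object without keeping it alive, and the dict here plays the role of the "complex object":
import weakref

class C(object):
    pass

c = C()
watcher = weakref.ref(c)        # observes the object without keeping it alive

container = {'mine': c}         # a "complex object" holding a reference
del c                           # the name is gone, but the dict still refers to it
print(watcher() is not None)    # True: still alive, still usable

del container['mine']           # last reference gone
print(watcher())                # None on CPython: the instance has been freed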