I have a question regarding garbage collection in Python. After reading some insightful articles on why one might prefer to run a Python program with the garbage collector disabled*, I decided to find and remove all circular references in my code so that objects can be destroyed through ref-counting alone.
To find existing circular references, I put a call to gc.collect() in the tearDown method of my unittest cases and printed a warning whenever it returned a value > 0. Most of the issues found were easily fixed by refactoring or by using weak references.
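For reference, the tearDown check looked roughly like this (the base-class name LeakCheckTestCase is just for illustration):

import gc
import unittest
import warnings

class LeakCheckTestCase( unittest.TestCase ):
    def tearDown( self ):
        # gc.collect() returns the number of unreachable objects it found,
        # i.e. objects that could only be freed by breaking reference cycles.
        unreachable = gc.collect()
        if unreachable > 0:
            warnings.warn( "%s left %d objects for the cycle collector"
                           % ( self.id(), unreachable ) )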
After a while though, I came across a rather curious problem, best expressed in code:
import gc
gc.disable()

def bar():
    class Foo( object ):
        pass

bar()
print( gc.collect() )  # prints 6
When removing the call to bar(), gc.collect() returns 0, as expected.
It seems like even though Foo is created within the scope of the function bar and never returned to the outside, it sticks around and causes the garbage collector to find unreachable objects.
When moving Foo outside the scope of bar, everything works fine again. That solution, however, is not applicable to the problem I am trying to solve in the affected code (dynamic creation of ctypes.Structures for serialization).
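To give an idea of what I mean by that, here is a stripped-down sketch; the make_struct helper and the field list are made up for this example:

import ctypes

def make_struct( name, fields ):
    # Build a ctypes.Structure subclass at runtime from a list of
    # ( field_name, ctype ) tuples.
    return type( name, ( ctypes.Structure, ), { "_fields_": fields } )

Point = make_struct( "Point", [ ( "x", ctypes.c_int ), ( "y", ctypes.c_int ) ] )
p = Point( 3, 4 )
print( p.x, p.y )  # 3 4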
The following two approaches did not work either:
import gc
gc.disable()

def bar():
    type( "Foo", ( object, ), {} )

bar()
print( gc.collect() )  # prints 6 again
or even the very 'clever':
import gc
gc.disable()
import weakref

def bar():
    weakref.ref( type( "Foo", ( object, ), {} ) )

bar()
print( gc.collect() )  # still prints 6
To top it off, here's an example that actually works ... but only in Python 2:
import gc
gc.disable()

def bar():
    class Foo():  # not subclassing object
        pass

bar()
print( gc.collect() )  # prints 0 - finally?
The code above, however, again prints "6" in Python 3 - I suspect because all user-defined classes are new-style classes in Python 3.
So, am I stuck with Python 2, with weird "unreachable objects" in Python 3, or do I have to follow up every call to bar with a manual garbage collection?
* Articles on running Python with gc.disable():
http://pydev.blogspot.de/2014/03/should-python-garbage-collector-be.html
http://dsvensson.wordpress.com/2010/07/23/the-garbage-garbage-collector-of-python/
See roippi's answer for an explanation of why the code above does, in fact, behave as expected.
For future reference though, here's a small workaround that will fix this particular problem. Not saying that disabling gc is the right thing for anyone to do, but if you feel like it's the right thing for you, this is how I did it:
import gc
gc.disable()

def requiresGC( func ):
    def func_wrapper( *args, **kwargs ):
        result = func( *args, **kwargs )
        gc.collect()
        return result
    return func_wrapper

@requiresGC
def bar():
    class Foo( object ):
        pass

bar()
print( gc.collect() )  # prints 0
Note, however, that this decorator causes a significant slowdown if bar() is a function that is called frequently. In my case (serialization) it is not, and keeping the gc overhead contained to a few specific functions seems like a reasonable compromise.
Thanks to everyone who took the time to answer so quickly! :-)
Python's garbage collection is used to free up memory. It is implemented in two ways: reference counting and a generational cycle collector. When an object's reference count reaches 0, reference counting cleans up the object immediately.
Yes, Python's garbage collector removes every object that is no longer referenced. The mechanism is based on reference counting; however, it can also deal with cyclic references. And of course, when the process terminates, all of its resources are released.
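A minimal sketch of a cycle that reference counting alone cannot clean up, but the cycle detector can:

import gc

class Node:
    pass

a = Node()
b = Node()
a.partner = b   # a and b now reference each other
b.partner = a
del a, b        # refcounts never drop to zero, so refcounting cannot free them

print(gc.collect() > 0)  # True: the cycle detector reclaims the pair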
Well, GC has its own drawbacks. First, it would have to run in the background, which in CPython is not really possible because of the GIL, so collection is a stop-the-world process. Second, because collection happens in the background, the exact time at which objects are released is nondeterministic.
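For example (a sketch assuming Python 3.4+; earlier versions put cycles whose objects define __del__ into gc.garbage instead):

import gc

class Resource:
    def __del__(self):
        print("released")

a = Resource()
b = Resource()
a.other = b
b.other = a
del a, b            # the cycle keeps both alive, so __del__ does not run here
print("nothing released yet")
gc.collect()        # only now does the collector break the cycle and call __del__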
Any time a reference count drops to zero, the object is deallocated immediately. A full collection is triggered when the number of new objects is greater than 25% of the number of existing objects.
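For reference, the per-generation thresholds and counters can be inspected directly (the exact numbers depend on the interpreter version and on what has already run):

import gc

print(gc.get_threshold())  # default in CPython: (700, 10, 10)
print(gc.get_count())      # current collection counts for the three generations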
Declaring a new-style class - either statically or via type() - creates a circular reference (actually, more than one). Here's the clearest example I can provide:
class Baz:
    pass

print(Baz in Baz.__mro__)
# True
There are a few other circular refs in Baz's __dict__ too, but one is all you need.
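If you want to poke at it yourself, gc can show the back-reference as well (a quick sketch):

import gc

class Baz:
    pass

mro = Baz.__mro__
print(Baz in mro)                                    # the tuple refers to Baz ...
print(any(r is Baz for r in gc.get_referrers(mro)))  # ... and Baz refers to the tuple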
There's not really any workaround I can offer you - this is what the GC is there for, I'm afraid. I can point you to this bug report that's been around for a while if you'd like to dive in further.