Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does __del__() interfere with garbage collection?

I read an example in David Beazley's Python Essential Reference:

class Account(object):
    def __init__(self,name,balance):
         self.name = name
         self.balance = balance
         self.observers = set()
    def __del__(self):
         for ob in self.observers:
             ob.close()
         del self.observers
    def register(self,observer):
        self.observers.add(observer)
    def unregister(self,observer):
        self.observers.remove(observer)
    def notify(self):
        for ob in self.observers:
            ob.update()
    def withdraw(self,amt):
        self.balance -= amt
        self.notify()

class AccountObserver(object):
     def __init__(self, theaccount):
         self.theaccount = theaccount
         theaccount.register(self)
     def __del__(self):
         self.theaccount.unregister(self)
         del self.theaccount
     def update(self):
         print("Balance is %0.2f" % self.theaccount.balance)
     def close(self):
         print("Account no longer in use")

# Example setup
a = Account('Dave',1000.00)
a_ob = AccountObserver(a)

It is mentioned that

... the classes have created a reference cycle in which the reference count never drops to 0 and there is no cleanup. Not only that, the garbage collector (the gc module) won’t even clean it up, resulting in a permanent memory leak.

Can someone explain how this happens? How can weakreferences help here?

like image 996
Sajeev Rajput Avatar asked Jan 25 '17 11:01

Sajeev Rajput


1 Answers

Account().observers is a set referencing AccountObserver() instances, but AccountObserver().theaccount is a reference pointing back to the Account() instance where the observer is stored in the set. That's a circular reference.

Normally, the garbage collector will detect such circles and break the cycle, allowing the reference counts to drop to 0 and normal clean-up to take place. There is an exception, however, for classes that define a __del__ method, as the classes in David's example do. From the Python 2 gc module documenation:

gc.garbage
list of objects which the collector found to be unreachable but could not be freed (uncollectable objects). By default, this list contains only objects with __del__() methods. Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily in the cycle but reachable only from it. Python doesn’t collect such cycles automatically because, in general, it isn’t possible for Python to guess a safe order in which to run the __del__() methods.

So the circle can't be broken because the garbage collector refuses to guess what finaliser (__del__ method) to call first. Note that picking one at random is not safe for the specific example; if you call Account().__del__ first, then the observers set is deleted and subsequent calls to AccountObserver().__del__ will fail with an AttributeError.

A weak reference doesn't participate in the reference count; so if AccountObserver().theaccount used a weak reference to point to the corresponding Account() instance instead, then Account() instance would not be kept alive if only weak references were left:

class AccountObserver(object):
    def __init__(self, theaccount):
        self.theaccountref = weakref.ref(theaccount)
        theaccount.register(self)

    def __del__(self):
        theaccount = self.theaccountref()
        if theaccount is not None:    
            theaccount.unregister(self)

    def update(self):
        theaccount = self.theaccountref()
        print("Balance is %0.2f" % theaccount.balance)

    def close(self):
        print("Account no longer in use")

Note that I link to the Python 2 documentation. As of Python 3.4, this is no longer true and even circular dependencies as show in the example will be cleared, as PEP 442 – Safe object finalization has been implemented:

The primary benefits of this PEP regard objects with finalizers, such as objects with a __del__ method and generators with a finally block. Those objects can now be reclaimed when they are part of a reference cycle.

Not that this won't lead to a traceback; if you execute the example in Python 3.6, delete the references, and kick off a garbage collection run, you get a traceback as the Account().observers set is likely to have been deleted:

>>> import gc
>>> del a, a_ob
>>> gc.collect()
Account no longer in use
Exception ignored in: <bound method AccountObserver.__del__ of <__main__.AccountObserver object at 0x10e36a898>>
Traceback (most recent call last):
  File "<stdin>", line 6, in __del__
  File "<stdin>", line 13, in unregister
AttributeError: 'Account' object has no attribute 'observers'
65

The traceback is just a warning otherwise, the gc.collect() call succeeded and the zombie AccountObserver() and Account() objects are reaped anyway.

like image 168
Martijn Pieters Avatar answered Nov 15 '22 14:11

Martijn Pieters