I've seen this in different code bases and just read about it on PyMOTW (see the first Note here).
The explanation says that a cycle will be created if the traceback from sys.exc_info()[2] is assigned to a variable, but why is that?
How big of a problem is this? Should I search for all uses of exc_info in my code base and make sure the traceback is deleted?
Python 3 (update to original answer):
In Python 3, the advice quoted in the question has been removed from the Python documentation. My original answer (which follows) applies only to versions of Python that include the quote in their documentation.
Python 2:
The Python garbage collector will, eventually, find and delete circular references like the one created by referring to a traceback stack from inside one of the stack frames themselves, so don't go back and rewrite your code. But, going forward, you could follow the advice of http://docs.python.org/library/sys.html (where it documents exc_info()) and say:
exctype, value = sys.exc_info()[:2]
when you need to grab the exception.
Two more thoughts:
First, why are you running exc_info() at all? If you want to catch an exception, shouldn't you just say:
try:
    ...
except Exception as e:  # or "Exception, e" in old Pythons
    ...  # do something with e
instead of mucking about with objects inside the sys module?
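As a minimal sketch of that suggestion (the function name parse_port and its error message are hypothetical, invented for illustration), the exception object can be used directly inside the except clause without touching sys at all. On Python 3 the name e is automatically unbound when the except clause ends.

```python
def parse_port(text):
    """Convert text to an integer port, or return a description of the failure."""
    try:
        return int(text)
    except ValueError as e:
        # `e` is the exception instance; no call to sys.exc_info() is needed.
        return "bad port {!r}: {}".format(text, e)
```

Calling parse_port("8080") returns the integer, while parse_port("http") returns a string built from the caught exception.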
Second: Okay, I've given a lot of advice but haven't really answered your question. :-)
Why is a cycle created? Well, in simple cases, a cycle is created when an object refers to itself:
a = [1,2,3]
a.append(a)
Or when two objects refer to each other:
a = [1,2,3]
b = [4,5,a]
a.append(b)
In both of these cases, when the function ends the variable values will still exist because they're locked in a reference-count embrace: neither can go away until the other has gone away first! Only the modern Python garbage collector can resolve this, by eventually noticing the loop and breaking it.
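The following sketch illustrates that behaviour on CPython. It assumes a small hypothetical Node class, used only because plain lists cannot be weakly referenced, and it temporarily disables the automatic collector so that the effect of the explicit gc.collect() call is visible.

```python
import gc
import weakref


class Node:
    """Hypothetical helper: an object that can point at another object."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Create two objects that refer to each other; return a weak reference to one."""
    a = Node()
    b = Node()
    a.other = b
    b.other = a
    # When this function returns, the names `a` and `b` disappear,
    # but each object still holds a reference to the other.
    return weakref.ref(a)


def cycle_survives_until_collected():
    """Return (alive before gc.collect(), alive after gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector out of the way for the demo
    try:
        ref = make_cycle()
        alive_before = ref() is not None  # reference counting alone did not free it
        gc.collect()                      # the cycle detector finds and breaks the loop
        alive_after = ref() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after
```

On CPython the object is still alive after the function returns and only goes away once the cycle collector has run.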
And so the key to understanding this situation is that a "traceback" object (the third thing, at index #2, returned by exc_info()) contains a "stack frame" for each function that was active when the exception was raised. And those stack frames are not "dead" objects showing what was true when the exception was raised; the frames are still alive! The function that's caught the exception is still alive, so its stack frame is a living thing, still gaining and losing variable references as its code executes to handle the exception (and do whatever else it does as it finishes the "except" clause and goes about its work).
So when you say t = sys.exc_info()[2], one of those stack frames inside of the traceback (the frame, in fact, belonging to the very function that's currently running) now has a variable in it named t that points back to the stack frame itself, creating a loop just like the ones that I showed above.
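A small sketch can make the loop visible. The function name traceback_loop_demo is hypothetical; the attributes it inspects (tb_frame on the traceback and f_locals on the frame) are standard, though the details of frame objects are specific to CPython.

```python
import inspect
import sys


def traceback_loop_demo():
    """Return (traceback's frame is this function's frame, that frame's locals hold t)."""
    try:
        raise ValueError("demo")
    except ValueError:
        t = sys.exc_info()[2]
        # The traceback's frame is the frame of this very function...
        same_frame = t.tb_frame is inspect.currentframe()
        # ...and that frame's local variables include `t` itself.
        t_in_locals = t.tb_frame.f_locals.get("t") is t
        del t  # drop the local name again once we are done inspecting
        return same_frame, t_in_locals
```

Both values come back True: the traceback refers to the running frame, and the running frame refers to the traceback through the local name t.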
The traceback contains references to all the active frames, which in turn contain references to all the local variables in those various frames -- those references are a big part of the very job of traceback and frame objects, so that's hardly surprising. So, if you add a reference back to the traceback (or fail to remove it promptly having temporarily added it), you inevitably form a big loop of references -- which interferes with garbage collection (and may stop it altogether if any of the objects in the loop belong to classes that override __del__, the finalizer method).
Especially in a long-running program, interfering with garbage collection is not the best idea, because you'll be holding on to memory you don't really need (for longer than necessary, or indefinitely if you've essentially blocked garbage collection on such loops by having them include objects with finalizers).
So, it's definitely best to get rid of tracebacks as soon as feasible, whether they come from exc_info or not!
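One common way to do that, when the traceback really is needed, is to delete the local name in a finally block. The sketch below is only an illustration: risky and run are hypothetical names, and formatting the traceback stands in for whatever work actually needs it.

```python
import sys
import traceback


def risky():
    """Hypothetical operation that always fails."""
    raise RuntimeError("boom")


def run():
    """Return the formatted traceback entries for the failure in risky()."""
    try:
        risky()
    except RuntimeError:
        tb = sys.exc_info()[2]
        try:
            # Use the traceback while it is needed...
            return traceback.format_tb(tb)
        finally:
            # ...and drop the local reference on every exit path.
            del tb
```

The returned list has one entry per frame in the traceback, here one for run and one for risky.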