
Watching generation lists during a program run

The Story:

During Nina Zakharenko's PyCon talk on Memory Management in Python, she explains the way the generational garbage collection works in Python noting that:

Python maintains a list of every object created as a program is run. Actually, it makes 3:

  • generation 0
  • generation 1
  • generation 2

The question:

To better understand memory management in Python, and to debug memory leaks, how can I observe which objects are added to and removed from each of the 3 generation lists during a program run?

I've looked through the gc module, but have not found a method that returns the current contents of the generation lists.

asked Jun 09 '16 23:06 by alecxe


1 Answer

As we discussed in the comments, I don't think there is a way to access the generation lists directly from Python. You can, however, set some debug flags. In Python 2, the following reports objects that can or cannot be collected:

import gc

gc.set_debug(gc.DEBUG_UNCOLLECTABLE | gc.DEBUG_COLLECTABLE | gc.DEBUG_OBJECTS )

In Python 3, the following gives you some generation output plus info on collectable and uncollectable objects:

import gc

gc.set_debug(gc.DEBUG_UNCOLLECTABLE | gc.DEBUG_COLLECTABLE  | gc.DEBUG_STATS)

You get output like:

gc: collecting generation 2...
gc: objects in each generation: 265 4454 0
gc: collectable <function 0x7fad67f77b70>
gc: collectable <tuple 0x7fad67f6f710>
gc: collectable <dict 0x7fad67f0e3c8>
gc: collectable <type 0x285db78>
gc: collectable <getset_descriptor 0x7fad67f095e8>
gc: collectable <getset_descriptor 0x7fad67f09630>
gc: collectable <tuple 0x7fad67f05b88>
gc: done, 7 unreachable, 0 uncollectable, 0.0028s elapsed.
gc: collecting generation 2...
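
The same kind of per-generation numbers can also be read programmatically instead of parsed from stderr. The sketch below is only an illustration: `generation_summary` is a hypothetical helper name, and it assumes Python 3.4 or later, where `gc.get_stats()` is available. Note that `gc.get_count()` returns the collector's counters, not the contents of the generation lists.

```python
import gc


def generation_summary():
    """Return one dict of gc counters per generation (hypothetical helper)."""
    counts = gc.get_count()          # current collection counters (count0, count1, count2)
    thresholds = gc.get_threshold()  # thresholds that trigger a collection
    stats = gc.get_stats()           # cumulative per-generation statistics
    summary = []
    for gen, (count, threshold, stat) in enumerate(zip(counts, thresholds, stats)):
        summary.append({
            "generation": gen,
            "count": count,
            "threshold": threshold,
            "collections": stat["collections"],      # times this generation was collected
            "collected": stat["collected"],          # objects freed so far
            "uncollectable": stat["uncollectable"],  # objects found uncollectable
        })
    return summary


if __name__ == "__main__":
    for row in generation_summary():
        print(row)
```

Calling the helper before and after a suspicious piece of code gives a rough idea of how much collector activity that code caused.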

For leaks, the documentation for gc.DEBUG_SAVEALL says that, when it is set, all unreachable objects found will be appended to garbage rather than being freed. This can be useful for debugging a leaking program:

import gc

gc.set_debug(gc.DEBUG_SAVEALL)
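
As a rough illustration of what ends up in gc.garbage with this flag, the sketch below builds a small reference cycle, drops it, and lists what the collector saved. The `Node` class and the `find_saved_nodes` helper are made up for this example and are not part of the gc module.

```python
import gc


class Node:
    """Hypothetical class used only to build a reference cycle."""

    def __init__(self, name):
        self.name = name
        self.other = None


def find_saved_nodes():
    """Create a two-object cycle, collect it, and return the saved Node names."""
    old_flags = gc.get_debug()
    gc.collect()                    # get rid of garbage that existed beforehand
    gc.set_debug(gc.DEBUG_SAVEALL)  # keep unreachable objects in gc.garbage
    try:
        a, b = Node("a"), Node("b")
        a.other, b.other = b, a     # a and b now reference each other
        del a, b                    # the cycle is unreachable from here on
        gc.collect()
        return sorted(obj.name for obj in gc.garbage if isinstance(obj, Node))
    finally:
        gc.set_debug(old_flags)     # restore the previous debug flags
        del gc.garbage[:]           # drop the saved objects so they can be freed


if __name__ == "__main__":
    print(find_saved_nodes())
```

The flag keeps the saved objects alive, so the sketch restores the old debug flags and empties gc.garbage once it has finished inspecting them.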

In Python 3 (3.3 and later), you can also append a callback to gc.callbacks that is run when the gc starts and finishes. A simple example:

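import gc


def f(phase, info):
    # phase is "start" or "stop"; info is a dict with the keys
    # "generation", "collected" and "uncollectable"
    if phase == "start":
        print("starting garbage collection....")
    else:
        print("Finished garbage collection.... \n{}".format(
            "".join(["{}: {}\n".format(*tup) for tup in info.items()])))

        # gc.garbage only holds every unreachable object
        # when gc.DEBUG_SAVEALL is set
        print("Unreachable objects: \n{}".format(
            "\n".join([str(garb) for garb in gc.garbage])))
        print()


gc.callbacks.append(f)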

Combining gc.DEBUG_SAVEALL with the function will show you any unreachable objects. This is not much different from setting DEBUG_COLLECTABLE or DEBUG_LEAK, but it is one example of adding a callback.
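
On newer interpreters there is one more option: since Python 3.8, gc.get_objects() accepts a generation argument and returns only the tracked objects in that generation. The sketch below assumes Python 3.8 or later. `objects_per_generation` and `generation_of` are hypothetical helper names, and how objects are spread across generations depends on the interpreter version.

```python
import gc
import sys


def objects_per_generation():
    """Return {generation: number of gc-tracked objects}; needs Python 3.8+."""
    if sys.version_info < (3, 8):
        raise RuntimeError("gc.get_objects(generation=...) requires Python 3.8+")
    return {gen: len(gc.get_objects(generation=gen)) for gen in range(3)}


def generation_of(obj):
    """Return the generation that currently holds obj, or None if gc does not track it."""
    for gen in range(3):
        # compare by identity, since equality may be overridden or expensive
        if any(candidate is obj for candidate in gc.get_objects(generation=gen)):
            return gen
    return None


if __name__ == "__main__":
    sample = []  # lists are always tracked by the collector
    print(objects_per_generation())
    print(generation_of(sample))
```

Taking such snapshots at different points in a run, and comparing them, is one way to watch objects move between generations. Scanning every tracked object is slow, so this is better suited to debugging sessions than to production code.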

answered Nov 11 '22 10:11 by Padraic Cunningham