
Where is Python's shutdown procedure setting module globals to None documented?

CPython has a strange behaviour where it sets module globals to None during shutdown. This breaks error logging during shutdown in some multithreaded code I've written.

I can't find any documentation of this behaviour. It's mentioned in passing in PEP 432:

[...] significantly reducing the number of modules that will experience the "module globals set to None" behaviour that is used to deliberate break cycles and attempt to releases more external resources cleanly.

There are SO questions about this behaviour and the C API documentation mentions shutdown behaviour for embedded interpreters.

I've also found a related thread on python-dev and a related CPython bug:

This patch does not change the behavior of module objects clearing their globals dictionary as soon as they are deallocated.

Where is this behaviour documented? Is it Python 2 specific?

Wilfred Hughes asked Sep 03 '14 16:09





1 Answer

The behaviour is not well documented. It is present in all versions of Python from roughly 1.5 up to Python 3.4, which changed it:

As part of this change, module globals are no longer forcibly set to None during interpreter shutdown in most cases, instead relying on the normal operation of the cyclic garbage collector.

The only documentation for the behaviour is the moduleobject.c source code:

    /* To make the execution order of destructors for global
       objects a bit more predictable, we first zap all objects
       whose name starts with a single underscore, before we clear
       the entire dictionary.  We zap them by replacing them with
       None, rather than deleting them from the dictionary, to
       avoid rehashing the dictionary (to some extent). */

Note that setting the values to None is an optimisation; the alternative would be to delete names from the mapping, which would lead to different errors (NameError exceptions rather than AttributeErrors when trying to use globals from a __del__ handler).
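The difference between the two failure modes can be imitated without shutting the interpreter down. The following is only a sketch: it builds a throwaway namespace with exec, and the names SOURCE, _helper, finalizer and error_after are made up for illustration. It does not reproduce CPython's actual shutdown code; it only shows what a function sees when one of its globals is replaced with None versus removed.

```python
# Hypothetical module source: `_helper` stands in for any module global.
SOURCE = '''
_helper = "some resource"

def finalizer():
    # Looks up the global `_helper` at call time, as code run from __del__ would.
    return _helper.upper()
'''


def error_after(mutate):
    """Run SOURCE in a fresh namespace, apply `mutate`, then call finalizer().

    Returns the name of the exception type raised, or None if nothing was raised.
    """
    namespace = {}
    exec(SOURCE, namespace)
    mutate(namespace)
    try:
        namespace["finalizer"]()
    except Exception as exc:
        return type(exc).__name__
    return None


def zap(namespace):
    # What the old shutdown code did: keep the key, replace the value with None.
    namespace["_helper"] = None


def delete(namespace):
    # The alternative mentioned above: remove the name from the mapping.
    del namespace["_helper"]


zapped_error = error_after(zap)
deleted_error = error_after(delete)
print(zapped_error, deleted_error)
```

With the value replaced, the lookup succeeds but the attribute access on None fails; with the name removed, the lookup itself fails.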

As you found out on the mailing list, the behaviour predates the cyclic garbage collector; it was added in 1998, while the cyclic garbage collector was added in 2000. Since function objects always reference the module __dict__, all function objects in a module are involved in circular references, which is why the __dict__ needed clearing before GC came into play.
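The cycle between a function and its module's __dict__ can be inspected directly. This is a small sketch, assuming CPython; the module name demo and the function greet are invented for the example, and the module is created in memory rather than imported from a file.

```python
import gc
import types

# Hypothetical in-memory module; "demo" and "greet" are illustrative names.
demo = types.ModuleType("demo")
exec("def greet():\n    return 'hello'\n", demo.__dict__)

# A function's __globals__ is the very same dict object as the module's __dict__.
globals_is_module_dict = demo.greet.__globals__ is demo.__dict__

# The dict refers to the function (as a value) ...
dict_refers_to_function = any(
    obj is demo.greet for obj in gc.get_referents(demo.__dict__)
)
# ... and the function refers back to the dict, closing the cycle.
function_refers_to_dict = any(
    obj is demo.__dict__ for obj in gc.get_referents(demo.greet)
)

print(globals_is_module_dict, dict_refers_to_function, function_refers_to_dict)
```

Without a cycle collector, reference counting alone can never free such a pair, which is why the dictionary had to be emptied explicitly.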

It was kept in place even when cyclic GC was added, because there might be objects with __del__ methods involved in cycles. These aren't otherwise garbage-collectable, and cleaning out the module dictionary would at least remove the module __dict__ from such cycles. Not doing that would keep all referenced globals of that module alive.

The changes made for PEP 442 now make it possible for the garbage collector to clear cyclic references involving objects that provide a __del__ finalizer, removing the need to clear the module __dict__ in most cases. The clearing code is still there, but it is only triggered if the module's __dict__ is still alive after the contents of sys.modules have been moved to weak references and a GC collection run has been started during interpreter shutdown; the module finalizer itself simply decrements reference counts.
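The effect of PEP 442 on such cycles can be seen in a short experiment. This sketch assumes CPython 3.4 or later; the Node class and the finalized list are invented for the example. On earlier versions the two objects would have ended up in gc.garbage instead of being finalized.

```python
import gc

finalized = []  # records which objects had their __del__ called


class Node:
    """Hypothetical object with a finalizer, used to build a reference cycle."""

    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        finalized.append(self.name)


a = Node("a")
b = Node("b")
a.other = b
b.other = a  # a and b now form a cycle, and both define __del__

del a, b      # reference counts cannot reach zero because of the cycle
gc.collect()  # the cycle collector finalizes and frees both objects

print(sorted(finalized))
```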

Martijn Pieters answered Oct 06 '22 04:10
