
What is the correct boilerplate for explicit relative imports?

Tags: python, relative-import, boilerplate

PEP 366 (Main module explicit relative imports), which introduced the module-scope variable `__package__` to allow explicit relative imports in submodules, contains the following excerpt:

> When the main module is specified by its filename, then the `__package__` attribute will be set to `None`. To allow relative imports when the module is executed directly, boilerplate similar to the following would be needed before the first relative import statement:
>
> ```python
> if __name__ == "__main__" and __package__ is None:
>     __package__ = "expected.package.name"
> ```
>
> Note that this boilerplate is sufficient only if the top level package is already accessible via `sys.path`. Additional code that manipulates `sys.path` would be needed in order for direct execution to work without the top level package already being importable.
>
> This approach also has the same disadvantage as the use of absolute imports of sibling modules - if the script is moved to a different package or subpackage, the boilerplate will need to be updated manually. It has the advantage that this change need only be made once per file, regardless of the number of relative imports.

I have tried to use this boilerplate in the following setting:

Directory layout:
```
foo
├── bar.py
└── baz.py
```

Contents of the bar.py submodule:

```python
if __name__ == "__main__" and __package__ is None:
    __package__ = "foo"

from . import baz
```

The boilerplate works when executing the submodule bar.py from the file system (the PYTHONPATH modification makes the package foo/ accessible on sys.path):

```sh
PYTHONPATH=$(pwd) python3 foo/bar.py
```

The boilerplate also works when executing the submodule bar.py from the module namespace:

```sh
python3 -m foo.bar
```

However, the following alternative boilerplate works just as well in both cases as the contents of the bar.py submodule:

```python
if __package__:
    from . import baz
else:
    import baz
```

Furthermore, this alternative boilerplate is simpler and does not require any update of the submodule bar.py when it is moved together with the submodule baz.py to a different package (since it does not hard-code the package name "foo").

So here are my questions about the boilerplate of PEP 366:

1. Is the first subexpression `__name__ == "__main__"` necessary, or is it already implied by the second subexpression `__package__ is None`?
2. Shouldn’t the second subexpression `__package__ is None` be `not __package__` instead, in order to handle the case where `__package__` is the empty string (as in a `__main__.py` submodule executed from the file system by supplying the containing directory: `PYTHONPATH=$(pwd) python3 foo/`)?


asked Sep 15 '20 20:09

Maggyero

1 Answer

The correct boilerplate is none: just write the explicit relative import and let the exception escape if someone tries to run the module as a script or has `sys.path` misconfigured:

```python
from . import baz
```

The boilerplate given in PEP 366 is only there to show that the proposed change is sufficient to let users make direct execution* work if they really want to. It isn’t intended to suggest that making direct execution work is a good idea (it isn’t; it is a bad idea that will almost inevitably cause other problems, even with the boilerplate from the PEP).

Your proposed alternative boilerplate recreates the problem caused by implicit relative imports in Python 2: the "baz" module gets imported as baz from __main__, but will be imported as "foo.baz" everywhere else, so you end up with two copies in sys.modules under different names.

Amongst other problems, this means that if some other module raises `foo.baz.SomeException` and your `__main__` module tries to catch `baz.SomeException`, it won’t work, as those will be two different exception classes coming from two different module objects.
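The duplicate-module problem can be reproduced in a few lines. The following is a minimal sketch, assuming a throwaway package named foo created in a temporary directory (the layout and the SomeException class simply mirror the names used above and are not taken from any real project). It puts both the package directory and its parent on `sys.path`, which is the situation `import baz` runs into when foo/bar.py is executed directly with PYTHONPATH set.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout: <tmp>/foo/baz.py defining one exception class.
root = Path(tempfile.mkdtemp())
pkg = root / "foo"
pkg.mkdir()
(pkg / "baz.py").write_text("class SomeException(Exception):\n    pass\n")

# Direct execution puts the script's directory (foo/) on sys.path;
# PYTHONPATH adds the parent. Both are simulated here.
sys.path[:0] = [str(pkg), str(root)]

baz = importlib.import_module("baz")          # what `import baz` does
foo_baz = importlib.import_module("foo.baz")  # what every other module does

same_file = Path(baz.__file__).resolve() == Path(foo_baz.__file__).resolve()
same_class = baz.SomeException is foo_baz.SomeException

try:
    raise foo_baz.SomeException("raised elsewhere in the package")
except baz.SomeException:
    caught = True
except Exception:
    caught = False

print("same file:", same_file)
print("same class:", same_class)
print("caught via baz.SomeException:", caught)
```

Both names resolve to the same file on disk, yet the file is executed twice and registered twice in `sys.modules`, so the `except baz.SomeException` clause does not match an exception raised through `foo.baz`.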

By contrast, if you use the PEP boilerplate, then __main__ will correctly import baz as "foo.baz", and the only thing you have to worry about is other modules potentially importing foo.bar.
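To check that claim, the sketch below writes the PEP boilerplate into a temporary foo/bar.py and runs it by file path in a child interpreter. The temporary layout, the `result` variable and the extra `print` line are illustrative assumptions, and the behaviour shown is what current CPython 3 releases do; it may differ on future interpreter versions.

```python
import os
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical layout: <tmp>/foo/{bar.py,baz.py}
root = Path(tempfile.mkdtemp())
pkg = root / "foo"
pkg.mkdir()
(pkg / "baz.py").write_text("")
(pkg / "bar.py").write_text(textwrap.dedent('''\
    import sys

    if __name__ == "__main__" and __package__ is None:
        __package__ = "foo"

    from . import baz

    # Report the name baz was registered under, and whether a second
    # top-level copy called "baz" also exists.
    print(baz.__name__, "baz" in sys.modules)
'''))

# Equivalent of: PYTHONPATH=<tmp> python3 foo/bar.py
env = {**os.environ, "PYTHONPATH": str(root)}
result = subprocess.run(
    [sys.executable, str(pkg / "bar.py")],
    env=env, capture_output=True, text=True,
)
print(result.stdout.strip())
```

The child process should report the module name `foo.baz` and no separate top-level `baz` entry, which is the difference from the `import baz` fallback.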

If you want simpler boilerplate that explicitly guards against the "inadvertently making two copies of the same module under a different name" bug without hardcoding the package name, then you can use this:

```python
if not __package__:
    raise RuntimeError(f"{__file__} must be imported as a package submodule")
```

However, if you are going to do that, you can just as well do `from . import baz` unconditionally as suggested above, and let the underlying exception escape if someone tries to run the script directly instead of via the `-m` switch.
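For reference, here is a rough sketch of what "letting the exception escape" looks like in practice. It assumes a temporary foo package whose bar.py contains only the unconditional relative import (the file contents and variable names are made up for illustration), and runs it once by file path and once with `-m`.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical layout: <tmp>/foo/{bar.py,baz.py}, no boilerplate at all.
root = Path(tempfile.mkdtemp())
pkg = root / "foo"
pkg.mkdir()
(pkg / "baz.py").write_text("")
(pkg / "bar.py").write_text("from . import baz\nprint(baz.__name__)\n")

env = {**os.environ, "PYTHONPATH": str(root)}

def run(*args):
    """Run a child interpreter from the temporary root directory."""
    return subprocess.run(
        [sys.executable, *args],
        cwd=root, env=env, capture_output=True, text=True,
    )

direct = run(str(pkg / "bar.py"))   # python foo/bar.py
as_module = run("-m", "foo.bar")    # python -m foo.bar

print("direct exit code:", direct.returncode)
print("direct error:", direct.stderr.strip().splitlines()[-1])
print("-m output:", as_module.stdout.strip())
```

Direct execution fails with an ImportError about an attempted relative import, which already tells the user what went wrong, while the `-m` invocation imports `foo.baz` as intended.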

* Direct execution means executing code from:

1. A file path argument, except directory and zip file paths (`python <file path>`).
2. A `-c` argument (`python -c <code>`).
3. The interactive interpreter (`python`).
4. Standard input (`python < <file path>`).

Indirect execution means executing code from:

5. A directory or zip file path argument (`python <directory or zip file path>`).
6. A `-m` argument (`python -m <module name>`).
7. An import statement (`import <module name>`).
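The execution modes listed above can be told apart by looking at `__name__` and `__package__`. The script below is a small probe, assuming a temporary foo directory containing bar.py and `__main__.py` (hypothetical files that only print the two variables). It covers every case except the interactive interpreter, and reflects the behaviour of current CPython 3 releases.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

PROBE = "print(repr(__name__), repr(__package__))\n"

# Hypothetical layout: <tmp>/foo/{bar.py,__main__.py}, both just the probe.
root = Path(tempfile.mkdtemp())
pkg = root / "foo"
pkg.mkdir()
(pkg / "bar.py").write_text(PROBE)
(pkg / "__main__.py").write_text(PROBE)

env = {**os.environ, "PYTHONPATH": str(root)}

def probe(*args, stdin=None):
    """Run a child interpreter with the given arguments; return its stdout."""
    done = subprocess.run(
        [sys.executable, *args],
        cwd=root, env=env, input=stdin, capture_output=True, text=True,
    )
    return done.stdout.strip()

results = {
    "1 file path": probe(str(pkg / "bar.py")),
    "2 -c": probe("-c", PROBE),
    "4 stdin": probe(stdin=PROBE),
    "5 directory": probe(str(pkg)),
    "6 -m": probe("-m", "foo.bar"),
    "7 import": probe("-c", "import foo.bar"),
}
for mode, output in results.items():
    print(f"{mode:12} {output}")
```

Cases 1 to 4 report `__package__` as `None`, directory execution reports the empty string, and `-m` and `import` report the real package name, which is why `__package__ is None` and `not __package__` select different sets of cases.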

Now to answer your questions specifically:

> 1. Is the first subexpression `__name__ == "__main__"` necessary, or is it already implied by the second subexpression `__package__ is None`?

It is hard to get `__package__ is None` anywhere other than the `__main__` module with the modern import system. It used to be a lot more common, though: rather than being set by the import system on module load, `__package__` would instead be set lazily by the first explicit relative import executed in the module. In other words, the boilerplate is only trying to let direct execution work (cases 1 to 4 above), but `__package__ is None` used to hold for either direct execution or an import statement (case 7 above). Filtering out case 7 therefore required the subexpression `__name__ == "__main__"` (true for cases 1 to 6 above).

> 2. Shouldn’t the second subexpression `__package__ is None` be `not __package__` instead, in order to handle the case where `__package__` is the empty string (as in a `__main__.py` submodule executed from the file system by supplying the containing directory: `PYTHONPATH=$(pwd) python3 foo/`)?

No, because the boilerplate is only trying to let direct execution work (cases 1 to 4 above); it isn’t trying to let other flavours of `sys.path` misconfiguration pass silently.


answered Sep 19 '22 19:09

ncoghlan
