
Why does the from ... import ... statement contain an implicit import?

Given a package:

package/
├── __init__.py
└── module.py

__init__.py:

from .module import function

module.py:

def function():
    pass

One can import the package and print its namespace.

python -c 'import package; print(dir(package))'
['__builtins__', ..., 'function', 'module']

Question:

Why does the namespace of package contain module when only function was imported in the __init__.py?

I would have expected that the package's namespace would only contain function and not module. This mechanism is also mentioned in the documentation:

"When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object."

but is not really motivated there. To me this choice seems odd, as I think of sub-modules as an implementation detail used to structure packages, and I do not expect them to be part of the API, since the structure can change.

Also, I know "Python is for consenting adults" and one cannot truly hide anything from a user. But I would argue that binding the sub-modules' names to the package's scope makes it less obvious to a user what is actually part of the API and what may change.

Why not use a __sub_modules__ attribute or similar to make sub-modules accessible to a user? What is the reason for this design decision?

asked Jan 08 '20 by felixinho

2 Answers

You say you think of submodules as implementation details. This is not the design intent behind submodules; they can be, and extremely commonly are, part of the public interface of a package. The import system was designed to facilitate access to submodules, not to prevent access.

Loading a submodule places a binding into the parent's namespace because this is necessary for access to the module. For example, after the following code:

import package.submodule

the expression package.submodule must evaluate to the module object for the submodule. package evaluates to the module object for the package, so this module object must have a submodule attribute referring to the module object for the submodule.

At this point, you are almost certainly thinking, "hey, there's no reason from .submodule import function has to do the same thing!" It does the same thing because this attribute binding is part of submodule initialization, which only happens on the first import, and which needs to do the same setup regardless of what kind of import triggered it.
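To see this concretely, here is a quick check using the package layout from the question (whose __init__.py contains only from .module import function):

import package

# only `function` was imported explicitly in __init__.py, yet initializing
# the submodule also bound it on the parent package
print(package.function)  # <function function at 0x...>
print(package.module)    # <module 'package.module' ...>

# and the re-exported name refers to the very same object
print(package.module.function is package.function)  # True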

This is not an extremely strong reason. With enough changes and rejiggering, the import system definitely could have been designed the way you expect. It was not designed that way because the designers had different priorities than you. Python's design cares very little about hiding things or supporting any notion of privacy.

answered Oct 22 '22 by user2357112 supports Monica

You have to understand that Python is a runtime language. def, class and import are all executable statements that, when executed, create (respectively) a function, class or module object and bind it in the current namespace.

Wrt/ modules (packages being modules too - at least at runtime), the very first time a module is imported (directly or indirectly) in a given process, the matching .py file (well, usually its compiled .pyc version) is executed - all statements at the top level are executed in order - and the resulting namespace is used to populate the module instance. Only once this has been done can any name defined in the module be accessed (you cannot access something that doesn't exist yet, can you?). The module object is then cached in sys.modules for subsequent imports. In this process, when a sub-module is loaded, it is bound as an attribute of its parent module.
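A minimal sketch of that lifecycle, again assuming the package layout from the question:

import sys

import package.module  # first import: runs module.py top to bottom, then caches it

# both the package and the submodule are now cached for subsequent imports
print('package' in sys.modules)         # True
print('package.module' in sys.modules)  # True

# and the submodule has been bound as an attribute of its parent module
print(sys.modules['package'].module is sys.modules['package.module'])  # True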

For me this choice seems odd, as I think of sub-modules as implementation detail to structure packages and do not expect them to be part of the API as the structure can change

Actually, Python's designers considered things the other way round: a "package" (note that there's no 'package' type at runtime) is mostly a convenience for organizing a collection of related modules - IOW, the module is the real building block - and as a matter of fact, at runtime, even when what you import is technically a "package", it still materializes as a module object.

Now as for the "do not expect them to be part of the API as the structure can change" part, this has of course been taken into account. It's actually a quite common pattern to start out with a single module, and then turn it into a package as the code base grows - without impacting client code, of course. The key here is to make proper use of your package's initializer - the __init__.py file - which is what your package's module instance is built from. This lets the package act as a "facade", masking the "implementation details" of which submodule effectively defines which function, class or whatever.
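As a sketch of that facade pattern, suppose the package grows into two submodules (the names _parsing and _validation are hypothetical, chosen here for illustration):

package/
├── __init__.py
├── _parsing.py      # defines parse()
└── _validation.py   # defines validate()

package/__init__.py:

# clients keep writing `from package import parse, validate`,
# no matter how the implementation is split across submodules
from ._parsing import parse
from ._validation import validate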

So the solution here is quite simply to, in your package's __init__.py, 1/ import the names you want to make public (so client code can import them directly from your package instead of having to go through the submodule) and 2/ define the __all__ attribute with the names that should be considered public, so the interface is clearly documented.
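With the package from the question, the __init__.py would then read:

from .module import function

# only `function` is part of the documented public interface
__all__ = ['function']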

FWIW, this last operation should be done for all your submodules too, and you can also use the _single_leading_underscore naming convention for things that really are "implementation details".
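For instance, in module.py (the _helper name is hypothetical, chosen for illustration):

def function():  # public: re-exported by __init__.py and listed in __all__
    pass

def _helper():   # leading underscore: an implementation detail, not part of the API
    pass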

None of this will of course prevent anyone from importing even "private" names directly from your submodules, but then they are on their own when something breaks ("we are all consenting adults", etc.).

answered Oct 22 '22 by bruno desthuilliers