I have created a Python package which builds on the structure indicated in Kenneth Reitz' "Repository Structure and Python" (1). The main package path is:
/projects-folder (not site-packages)
    /package
        /package
            __init__.py
            Datasets.py
            Draw.py
            Gmaps.py
            ShapeSVG.py
            project.py
        __init__.py
        setup.py
With the current structure, I must use the following module import syntax:
import package.package.Datasets
I would prefer to type the following:
import package.Datasets
I am capable of typing the same word twice, of course, but it feels wrong in a deeper sense, i.e., I am structuring my package incorrectly or misunderstanding how Python interprets that structure.
The outer __init__.py is required for Python to detect this package at all, per the docs (2). But that sets up /package/ as the top level of the package and /package/package/ as a sub-package, forcing me into the unwieldy import syntax above.
To avoid this, it seems that my options are to:
PYTHONPATH environment variable.
Yet both of these seem like suboptimal workarounds for something that shouldn't be an issue in the first place. What should I do?
You've misunderstood. You have two nested package folders, both set up as packages, but the source you cite never says to do that. The outer folder, the one with setup.py, is not supposed to be a package.
It sounds like you're running Python in projects-folder and trying to import your package from there. That's not what you should be doing. You have several options to get your package into the import system. (I'll refer to the folder with setup.py in it as setupfolder, to distinguish it from the inner folder):
- Build the package with setup.py, for example, python setup.py bdist_wheel --universal, and install the built package with pip.
- Install it directly with pip install path/to/setupfolder. Building the package produces an installer, which is useful if you want to distribute your package, but maybe you don't want to do that.
- Install it in editable mode with pip install -e path/to/setupfolder, so the Python import system will locate the package's source tree when performing imports. This is handy because you don't have to rebuild and reinstall if you edit the source repository, although you'll still want to restart any running Python processes that are using the package.
- Run Python from setupfolder.
Any of these options will cause your package to be importable directly as package instead of package.package, as it should be.
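To see why the folder you start from matters, here is a minimal sketch. It assumes a throwaway copy of the layout in a temporary directory; the file contents and the NAME constant are made up for illustration. Import names are resolved against the directories on sys.path, so once setupfolder (rather than projects-folder) is searched, the inner folder is the top-level package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway layout: <tmp>/setupfolder/package/{__init__,Datasets}.py
# The module contents are placeholders, not the real project's code.
root = Path(tempfile.mkdtemp())
setupfolder = root / "setupfolder"
inner = setupfolder / "package"
inner.mkdir(parents=True)
(setupfolder / "setup.py").write_text("# packaging metadata would go here\n")
(inner / "__init__.py").write_text("")
(inner / "Datasets.py").write_text("NAME = 'datasets'\n")

# Searching setupfolder is roughly what starting Python from that folder
# does; installing with pip reaches the same end result by making the
# inner package visible to the import system.
sys.path.insert(0, str(setupfolder))
importlib.invalidate_caches()

import package.Datasets

print(package.Datasets.NAME)
```

Note that setupfolder itself has no __init__.py here, and nothing is missing because of it.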
While I do not entirely agree with your package structure, you can make use of __all__ and possibly the one legitimate use for star imports I've seen so far. __init__.py can serve more purposes than just identifying your folder as a package or sub-package.
Using a Star Import
In package/package/__init__.py, add a variable __all__ that declares all the public elements you want to export:
__all__ = ['Datasets', 'Draw', 'Gmaps', 'ShapeSVG', 'project']
In package/__init__.py do from package.package import *. Now all the attributes that were available as package.package.x will also be available as package.x.
If you want to directly copy package.package.__all__ to package.__all__ (which is optional, but will allow you to do from package import * properly), you can do something like
from package.package import *
from package.package import __all__ as _all
__all__ = _all
del _all
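The re-export above can be tried out end to end with a small sketch. It assumes a throwaway nested layout named mypkg/mypkg (a stand-in for package/package) with placeholder module contents, so none of these names come from the real project.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway nested layout: <tmp>/mypkg/mypkg/..., standing in for
# package/package. Names and file contents are illustrative only.
root = Path(tempfile.mkdtemp())
outer = root / "mypkg"
inner = outer / "mypkg"
inner.mkdir(parents=True)

(inner / "Datasets.py").write_text("NAME = 'datasets'\n")
(inner / "Draw.py").write_text("NAME = 'draw'\n")
(inner / "__init__.py").write_text("__all__ = ['Datasets', 'Draw']\n")

# The outer __init__.py re-exports everything the inner package lists.
(outer / "__init__.py").write_text(
    "from mypkg.mypkg import *\n"
    "from mypkg.mypkg import __all__ as _all\n"
    "__all__ = _all\n"
    "del _all\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import mypkg
from mypkg import Datasets  # works: Datasets is now an attribute of mypkg

print(mypkg.Datasets is mypkg.mypkg.Datasets)
print(mypkg.__all__)

# The re-export binds names; it does not make 'mypkg.Datasets' findable
# as a module, so the dotted import statement still fails.
try:
    importlib.import_module("mypkg.Datasets")
except ModuleNotFoundError as exc:
    print("dotted import failed for:", exc.name)
```

One caveat this exposes: attribute access (mypkg.Datasets) and from mypkg import Datasets both work after the re-export, but the statement import mypkg.Datasets does not, because the import system still looks for a Datasets module inside the outer folder.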
Not Using a Star Import
You can accomplish the same thing without using package.package.__all__ at all. Just add __all__ directly to package/__init__.py and use from package.package import x-style imports:
from package.package import (
    Datasets, Draw, Gmaps, ShapeSVG, project
)
# As before, package.__all__ is optional
__all__ = ['Datasets', 'Draw', 'Gmaps', 'ShapeSVG', 'project']
I would still recommend having a package.package.__all__ variable, but it is optional for this particular purpose.
Pros and Cons
Both approaches are pretty legitimate and I have seen both used in major projects. The first approach reduces redundancy. You only define the public exports in one place: package.package.__all__. The star imports and package.__all__ reference that definition directly, leading to one place that you really have to maintain. On the other hand, there are times when you want to separate the "full" package.package.x API from what you expose via package.x to the casual user. In that case, go with the second option. The only downside here is that you have to be more careful to keep package.__all__ and the corresponding imports synchronized properly.
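If you go with the second option, a small consistency check can catch drift between __all__ and the actual imports. This is only a sketch: check_exports is a hypothetical helper, and the module below is simulated with types.ModuleType rather than being a real package.

```python
import types


def check_exports(module):
    """Compare a module's __all__ against the public names it defines.

    Returns (missing, unlisted): names in __all__ that are not defined,
    and public names that are defined but absent from __all__.
    """
    exported = getattr(module, "__all__", [])
    missing = [name for name in exported if not hasattr(module, name)]
    unlisted = [
        name for name in vars(module)
        if not name.startswith("_") and name not in exported
    ]
    return missing, unlisted


# Simulated top-level package: Draw is imported but not listed,
# Gmaps is listed but never imported.
pkg = types.ModuleType("package")
pkg.Datasets = types.ModuleType("package.package.Datasets")
pkg.Draw = types.ModuleType("package.package.Draw")
pkg.__all__ = ["Datasets", "Gmaps"]

print(check_exports(pkg))  # → (['Gmaps'], ['Draw'])
```

Something like this could run in a test suite against the real top-level package, so a forgotten entry shows up as a failing test instead of a surprise at import time.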
Note
A number of projects I've seen (numpy especially comes to mind) export the attributes of the individual modules to the top level using this technique. For example, if you had a function package.package.Datasets.get_data, it would be listed in package.package.Datasets.__all__, which would be imported into package.package.__init__, appended to package.package.__all__, and then be referenced by the top-level package and package.__all__.
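A minimal sketch of that lifting pattern, for a single package level only. The package name liftpkg, the get_data body, and the _helper function are invented for illustration, and this is not a copy of how any particular project does it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package: <tmp>/liftpkg/{__init__,Datasets}.py
root = Path(tempfile.mkdtemp())
pkg_dir = root / "liftpkg"
pkg_dir.mkdir()

# A module declares its own public names...
(pkg_dir / "Datasets.py").write_text(
    "__all__ = ['get_data']\n"
    "def get_data():\n"
    "    return [1, 2, 3]\n"
    "def _helper():\n"
    "    return None\n"
)

# ...and the package __init__ lifts them and extends its own __all__.
(pkg_dir / "__init__.py").write_text(
    "from . import Datasets\n"
    "from .Datasets import *\n"
    "__all__ = ['Datasets']\n"
    "__all__ += Datasets.__all__\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import liftpkg

print(liftpkg.__all__)     # → ['Datasets', 'get_data']
print(liftpkg.get_data())  # → [1, 2, 3]
```

Because Datasets.__all__ lists only get_data, the private _helper stays inside the module. A wrapping top-level package could then pick up liftpkg.__all__ with the same star-import approach described earlier.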