Avoiding module namespace pollution in Python

Don't worry about it. Just document how to use your package and let its consumers ignore the implementation details.

This is just horrible, in my opinion. A well-designed interface should be easily discoverable. Having the implementation details publicly visible makes the interface much more confusing. Even as the author of a package, I don't want to use it when it exposes too much, as it makes autocompletion less useful.

Add an underscore to the beginning of all implementation details.

This is a well-understood convention, and most development tools are smart enough to at least sort underscore-prefixed names to the bottom of autocomplete lists. It works fine if you have a small number of names to treat this way, but as the number of names grows, it becomes more and more tedious and ugly.

Take for example this relatively simple list of imports:

import struct

from abc    import abstractmethod, ABC
from enum   import Enum
from typing import BinaryIO, Dict, Iterator, List, Optional, Type, Union

Applying the underscore technique, this relatively small list of imports becomes this monstrosity:

import struct as _struct

from abc    import abstractmethod as _abstractmethod, ABC as _ABC
from enum   import Enum as _Enum
from typing import (
    BinaryIO as _BinaryIO,
    Dict     as _Dict,
    Iterator as _Iterator,
    List     as _List,
    Optional as _Optional,
    Type     as _Type,
    Union    as _Union
)

Now, I know this problem can be partially mitigated by never doing from imports, and just importing the entire package, and package-qualifying everything. While that does help this situation, and I realize that some people prefer to do this anyway, it doesn't eliminate the problem, and it's not my preference. There are some packages I prefer to import directly, but I usually prefer to import type names and decorators explicitly so that I can use them unqualified.
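As a rough sketch of that mitigation, the same Widget class can be written with whole-module imports, so only two names need underscore aliases instead of one per imported type. The aliases _abc and _typing are illustrative choices here, not something the snippets above use:

```python
# Whole-module imports: two aliased names instead of one alias per type.
import abc as _abc
import typing as _typing


class Widget(_abc.ABC):
    @_abc.abstractmethod
    def implement_me(self, input: _typing.List[int]) -> _typing.Dict[str, object]:
        ...


# The class is still abstract, exactly as with the from-import version.
abstract_names = Widget.__abstractmethods__
```

The trade-off is that every annotation and decorator is now module-qualified, which is the verbosity the from-import style was meant to avoid.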

There's an additional small problem with the underscore prefix. Take the following publicly exposed class:

class Widget(_ABC):
    @_abstractmethod
    def implement_me(self, input: _List[int]) -> _Dict[str, object]:
        ...

A consumer of this package who writes their own Widget subclass will see that they need to implement the implement_me method, and that it takes a _List and returns a _Dict. Those aren't actual type names, and now the implementation-hiding mechanism has leaked into my public interface. It's not a big problem, but it does contribute to the ugliness of this solution.

Hide the implementation details inside a function.

This one's definitely hacky, and it doesn't play well with most development tools.

Here's an example:

def module():
    import struct

    from abc    import abstractmethod, ABC
    from typing import BinaryIO, Dict, List

    def fill_list(r: BinaryIO, count: int, lst: List[int]) -> None:
        while count > 16:
            lst.extend(struct.unpack("<16i", r.read(16 * 4)))
            count -= 16
        while count > 4:
            lst.extend(struct.unpack("<4i", r.read(4 * 4)))
            count -= 4
        for _ in range(count):
            lst.append(struct.unpack("<i", r.read(4))[0])

    def parse_ints(r: BinaryIO) -> List[int]:
        count = struct.unpack("<i", r.read(4))[0]
        rtn: List[int] = []
        fill_list(r, count, rtn)
        return rtn

    class Widget(ABC):
        @abstractmethod
        def implement_me(self, input: List[int]) -> Dict[str, object]:
            ...

    return (parse_ints, Widget)

parse_ints, Widget = module()
del module

This works, but it's super hacky, and I don't expect it to operate cleanly in all development environments. ptpython, for example, fails to provide method signature information for the parse_ints function. Also, the type of Widget becomes my_package.module.<locals>.Widget instead of my_package.Widget, which is weird and confusing to consumers.
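The naming issue can be reproduced in a few lines. The sketch below uses a cut-down Widget and then overwrites __qualname__ by hand; that patch is an assumption of this example rather than part of the technique above, and it only changes the displayed name, not how development tools cope with the nested definitions:

```python
from abc import ABC, abstractmethod


def module():
    class Widget(ABC):
        @abstractmethod
        def implement_me(self):
            ...

    return Widget


Widget = module()
del module

# The class remembers that it was defined inside a function.
nested_name = Widget.__qualname__
print(nested_name)  # module.<locals>.Widget

# Hypothetical patch-up: rewrite the qualified name after the fact.
Widget.__qualname__ = "Widget"
patched_repr = repr(Widget)
```

Each exported class or function would need the same treatment, which adds to the hackiness rather than reducing it.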

Use `__all__`.

This is a commonly given solution to this problem: list the "public" members in the global __all__ variable:

import struct

from abc    import abstractmethod, ABC
from typing import BinaryIO, Dict, List

__all__ = ["parse_ints", "Widget"]

def fill_list(r: BinaryIO, count: int, lst: List[int]) -> None:
    ...  # You've seen this.

def parse_ints(r: BinaryIO) -> List[int]:
    ...  # This, too.

class Widget(ABC):
    ...  # And this.

This looks nice and clean, but unfortunately, the only thing __all__ affects is what happens when you use a wildcard import (from my_package import *), which most people don't do anyway.
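A minimal sketch of that limitation, using a throwaway in-memory module named demo_all (a hypothetical name, built with types.ModuleType so the example is self-contained):

```python
import sys
import types

# Source for the throwaway module: one public name listed in __all__,
# plus an import and a helper that are meant to be private.
source = '''
import struct

__all__ = ["parse_ints"]

def fill_list(): ...
def parse_ints(): ...
'''

demo_all = types.ModuleType("demo_all")
exec(source, demo_all.__dict__)
sys.modules["demo_all"] = demo_all

# A wildcard import honours __all__ ...
star_namespace = {}
exec("from demo_all import *", star_namespace)
wildcard_names = sorted(n for n in star_namespace if not n.startswith("__"))
print(wildcard_names)  # ['parse_ints']

# ... but dir() and attribute access still expose everything.
still_visible = [n for n in ("struct", "fill_list", "parse_ints") if n in dir(demo_all)]
print(still_visible)  # ['struct', 'fill_list', 'parse_ints']
```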

Convert the module to a subpackage, and expose the public interface in `__init__.py`.

This is what I'm currently doing, and it's pretty clean for most cases, but it can get ugly if I'm exposing multiple modules instead of flattening everything:

my_package/
+--__init__.py
+--_widget.py
+--shapes/
   +--__init__.py
   +--circle/
   |  +--__init__.py
   |  +--_circle.py
   +--square/
   |  +--__init__.py
   |  +--_square.py
   +--triangle/
      +--__init__.py
      +--_triangle.py

Then my __init__.py files look kind of like this:

# my_package/__init__.py

from my_package._widget import parse_ints, Widget

# my_package/shapes/circle/__init__.py

from my_package.shapes.circle._circle import Circle, Sphere

# my_package/shapes/square/__init__.py

from my_package.shapes.square._square import Square, Cube

# my_package/shapes/triangle/__init__.py

from my_package.shapes.triangle._triangle import Triangle, Pyramid

This makes my interface clean, and works well with development tools, but it makes my directory structure pretty messy if my package isn't completely flat.
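To see what a consumer ends up with, here is a sketch that writes a two-file version of this layout to a temporary directory and imports it. The package name demo_pkg and the stubbed-out bodies are placeholders for this example only:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a tiny package on disk: a private implementation module plus an
# __init__.py that re-exports only the public names.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "_widget.py").write_text(
    "import struct\n"
    "from abc import ABC\n"
    "\n"
    "def parse_ints(r): ...\n"
    "\n"
    "class Widget(ABC): ...\n"
)
(pkg_dir / "__init__.py").write_text(
    "from demo_pkg._widget import parse_ints, Widget\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

# Only the re-exported names show up without an underscore; struct and ABC
# stay inside demo_pkg._widget.
public_names = sorted(n for n in dir(demo_pkg) if not n.startswith("_"))
print(public_names)  # ['Widget', 'parse_ints']
```

One side effect worth knowing: Widget.__module__ is "demo_pkg._widget" rather than "demo_pkg", so the private module name can still appear in reprs and tracebacks.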

Is there a better technique?


asked Aug 30 '19 14:08

P Daddy

1 Answer

Convert to subpackages to limit the number of classes in a place and to separate concerns. If a class or constant is not needed outside of its module, prefix it with a double underscore. Import the module name if you do not want to explicitly import many classes from it. You have laid out all the solutions.


answered Oct 19 '22 00:10

spacether

