
Weird namespace pollution when importing submodule in a package's __init__.py

Tags:

python

main.py:

    import package

package/__init__.py:

    # use function to split local and global namespace
    def do_import():
        print globals().keys()
        print locals().keys()

        import foo as mod

        print locals().keys()
        print globals().keys()

    do_import()

package/foo.py:

    print 'Hello from foo'

Executing main.py produces output like this:

['__builtins__', '__file__', '__package__', '__path__', '__name__', 'do_import', '__doc__']
[]
Hello from foo
['mod']
['__builtins__', '__file__', '__package__', '__path__', '__name__', 'foo', 'do_import', '__doc__']

The import in __init__.py didn't work as expected. Notice that the global namespace now contains 'foo', although the module should be bound only to the local name 'mod'.

Even exec "import foo as mod" in {'__name__': __name__, '__path__': __path__} cannot stop the global namespace from being modified.

How could this happen?

youfu asked Sep 24 '12 02:09


1 Answer

Ah! Tricky, but I got it!

"foo" is not a simple "other package" - it is seen by Python as a sub-module of your "package" module.

When you first run "package" - either by importing it from an external script, or by running it with the -m command line switch (but not if you run python package/__init__.py directly from the command line), the "package" module is loaded and added to the sys.modules dictionary (on the sys module).

When the sub-module foo is loaded, besides being placed in sys.modules under the key "package.foo", it is also set as an attribute on its parent module. Therefore it becomes available in your Python app as package.foo. A module's attributes live in the same dictionary that the module's code sees as globals(), so setting an attribute on sys.modules["package"] has the same effect as setting a key in the globals of package/__init__.py at runtime. That is what is happening.
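The mechanism described above can be reproduced in a small self-contained script. This is only a sketch under some assumptions: it targets Python 3, so the implicit relative import from the question is spelled from . import foo as mod, and the package name demo_pkg and the variable new_globals are made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk. "demo_pkg" is a hypothetical name
# standing in for "package" from the question.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)

init_source = (
    "def do_import():\n"
    "    before = set(globals())\n"
    "    # Python 3 spelling of the question's 'import foo as mod'\n"
    "    from . import foo as mod\n"
    "    # Report which names the import added to the module globals\n"
    "    return sorted(set(globals()) - before)\n"
    "\n"
    "new_globals = do_import()\n"
)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write(init_source)
with open(os.path.join(pkg_dir, "foo.py"), "w") as f:
    f.write("print('Hello from foo')\n")

sys.path.insert(0, root)
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

# The only global added by the import is 'foo', not the local alias 'mod'.
print(demo_pkg.new_globals)

# The sub-module is registered in sys.modules and is the same object
# as the attribute on the parent package.
print(sys.modules["demo_pkg.foo"] is demo_pkg.foo)

# The package's attribute dictionary is the very dict that code in
# __init__.py sees as globals().
print(vars(demo_pkg) is demo_pkg.do_import.__globals__)

# The alias stayed local to do_import().
print(hasattr(demo_pkg, "mod"))
```

The import system, not the import statement's as clause, is what binds foo on the parent package, which is why changing the alias or the namespace passed to exec makes no difference.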

I hope I could translate the process into words properly - if not, just ask again by commenting.

-- Since this is probably happening in real code you have, and the equivalent of "do_import" is being called from code outside your package (which has the side effect of making your sub-modules appear in the package's global namespace), there is no easy workaround for the way you are doing it. My suggestion is to just add an underscore (_) at the beginning of the sub-module names if they are not intended to be used by general code outside your package. (They also won't show up if someone does from package import * in this case.)

jsbueno answered Oct 29 '22 23:10
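To illustrate the underscore suggestion, here is a rough sketch, again assuming Python 3 and a package that defines no __all__. The names demo_pkg_underscore, _hidden and shown are hypothetical. The underscore does not stop the sub-module from being bound on the package; it only keeps it out of a star import.

```python
import importlib
import os
import sys
import tempfile

# Throwaway package with one underscore-prefixed sub-module and one
# public sub-module (all names here are made up for the demonstration).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg_underscore")
os.mkdir(pkg_dir)

init_source = (
    "def do_import():\n"
    "    # Both imports use local aliases, as in the question\n"
    "    from . import _hidden as a\n"
    "    from . import shown as b\n"
    "\n"
    "do_import()\n"
)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write(init_source)
for name in ("_hidden.py", "shown.py"):
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write("")

sys.path.insert(0, root)
importlib.invalidate_caches()
pkg = importlib.import_module("demo_pkg_underscore")

# Both sub-modules still end up as attributes of the package...
print(hasattr(pkg, "_hidden"), hasattr(pkg, "shown"))

# ...but a star import (with no __all__ defined) skips names that
# start with an underscore.
namespace = {}
exec("from demo_pkg_underscore import *", namespace)
print("shown" in namespace, "_hidden" in namespace)
```

So the rename is a convention that hides the name from casual users rather than a way to prevent the binding itself.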