 

Python import path: packages with the same name in different folders


I am developing several Python projects for several customers at the same time. A simplified version of my project folder structure looks something like this:

/path/
  to/
    projects/
      cust1/
        proj1/
          pack1/
            __init__.py
            mod1.py
        proj2/
          pack2/
            __init__.py
            mod2.py
      cust2/
        proj3/
          pack3/
            __init__.py
            mod3.py

When I want to use functionality from proj1, for example, I extend sys.path by /path/to/projects/cust1/proj1 (e.g. by setting PYTHONPATH, by adding a .pth file to the site-packages folder, or even by modifying sys.path directly) and then import the module like this:

>>> from pack1.mod1 import something 

As I work on more projects, it happens that different projects have identical package names:

/path/
  to/
    projects/
      cust3/
        proj4/
          pack1/    <-- same package name as in cust1/proj1 above
            __init__.py
            mod4.py

If I now simply extend sys.path by /path/to/projects/cust3/proj4, I can still import from proj1, but not from proj4:

>>> from pack1.mod1 import something
>>> from pack1.mod4 import something_else
ImportError: No module named mod4

I think the second import fails because Python searches only the first folder in sys.path in which it finds a pack1 package, and gives up if it does not find the mod4 module in there. I asked about this in an earlier question (see "import python modules with the same name"), but the internal details are still unclear to me.
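The behaviour described above can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 and using temporary directories as hypothetical stand-ins for /path/to/projects/cust1/proj1 and /path/to/projects/cust3/proj4; the helper make_package is an illustrative name, not part of any library.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_package(root, module_name):
    """Create <root>/pack1/ with __init__.py and <module_name>.py (hypothetical helper)."""
    pkg = Path(root) / "pack1"
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / (module_name + ".py")).write_text("value = %r\n" % module_name)
    return str(root)


# Temporary stand-ins for the two project folders from the question.
base = Path(tempfile.mkdtemp())
proj1 = make_package(base / "cust1" / "proj1", "mod1")
proj4 = make_package(base / "cust3" / "proj4", "mod4")

# proj1 comes first on sys.path, as in the question.
sys.path[:0] = [proj1, proj4]
importlib.invalidate_caches()

import pack1.mod1

# The package is bound to the first matching folder only;
# submodules are then looked up in pack1.__path__, not in all of sys.path.
search_path = list(pack1.__path__)
print(search_path)

try:
    import pack1.mod4
    second_import_ok = True
except ImportError as exc:
    second_import_ok = False
    print("ImportError:", exc)
```

Printing pack1.__path__ shows a single entry, the pack1 folder under the first project, which is why mod4 in the second folder is never considered.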

Anyway, the obvious solution is to add another layer of namespace qualification by turning the project directories into super packages: add an __init__.py file to each proj* folder and extend sys.path by the customer folders instead of the project folders, e.g.

$ export PYTHONPATH=/path/to/projects/cust1:/path/to/projects/cust3
$ touch /path/to/projects/cust1/proj1/__init__.py
$ touch /path/to/projects/cust3/proj4/__init__.py
$ python
>>> from proj1.pack1.mod1 import something
>>> from proj4.pack1.mod4 import something_else
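The same super-package layout can be tried out without touching the real project folders. This is only a sketch under stated assumptions: the temporary directory stands in for /path/to/projects, and the module contents (a single variable called name) are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Temporary stand-in for /path/to/projects.
base = Path(tempfile.mkdtemp())

# (customer, project) -> module inside that project's pack1 package.
layout = {
    ("cust1", "proj1"): "mod1",
    ("cust3", "proj4"): "mod4",
}

for (customer, project), module in layout.items():
    project_dir = base / customer / project
    pkg = project_dir / "pack1"
    pkg.mkdir(parents=True)
    (project_dir / "__init__.py").write_text("")  # turns the project into a super package
    (pkg / "__init__.py").write_text("")
    (pkg / f"{module}.py").write_text(f"name = '{project}.pack1.{module}'\n")
    # The customer folder goes on sys.path, not the project folder.
    sys.path.append(str(base / customer))

importlib.invalidate_caches()

from proj1.pack1.mod1 import name as first
from proj4.pack1.mod4 import name as second

print(first)
print(second)
```

Both pack1 packages are importable here because they are now reached through different top-level names (proj1 and proj4).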

Now I am running into a situation where different projects for different customers have the same name, e.g.

/path/
  to/
    projects/
      cust3/
        proj1/    <-- same project name as for cust1 above
          __init__.py
          pack4/
            __init__.py
            mod4.py

Trying to import from mod4 does not work anymore for the same reason as before:

>>> from proj1.pack4.mod4 import yet_something_else
ImportError: No module named pack4.mod4

Following the same approach that solved this problem before, I would add yet another package / namespace layer and turn customer folders into super super packages.

However, this clashes with other requirements I have for my project folder structure, e.g.

  • a Development / Release structure to maintain several code lines
  • other kinds of source code, e.g. JavaScript, SQL, etc.
  • files other than source files, e.g. documents or data.

A less simplified, more real-world depiction of some project folders looks like this:

/path/
  to/
    projects/
      cust1/
        proj1/
          Development/
            code/
              javascript/
                ...
              python/
                pack1/
                  __init__.py
                  mod1.py
            doc/
              ...
          Release/
            ...
        proj2/
          Development/
            code/
              python/
                pack2/
                  __init__.py
                  mod2.py

I don't see how I can satisfy both the requirements the Python interpreter imposes on a folder structure and my own requirements at the same time. Maybe I could create an extra folder structure with some symbolic links and use that in sys.path, but looking at the effort I'm already making, I have a feeling that there is something fundamentally wrong with my entire approach. On a side note, I also have a hard time believing that Python really restricts my choice of source code folder names as it seems to do in the case depicted.

How can I set up my project folders and sys.path so I can import from all projects in a consistent manner if there are projects and packages with identical names?

asked Jan 20 '12 at 04:01 by ssc




1 Answer

This is the solution to my problem, although it might not be obvious at first.

In my projects, I have now introduced a convention of one namespace per customer. In every customer folder (cust1, cust2, etc.), there is an __init__.py file with this code:

import pkgutil
__path__ = pkgutil.extend_path(__path__, __name__)

All the other __init__.py files in my packages are empty (mostly because I haven't had the time yet to find out what else to do with them).

As explained here, extend_path makes Python aware that the sub-packages of a package can be physically located in more than one place. From what I understand, the interpreter then no longer stops searching when it fails to find a module under the first package path it encounters in sys.path, but searches all paths in __path__.
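To see the effect of extend_path on __path__, the convention can be exercised in a throwaway directory. This is a minimal sketch, assuming Python 3; the temporary folders are hypothetical stand-ins for two .../Development/code/python folders of the same customer, and make_tree and origin are illustrative names.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Content of every customer-level __init__.py, as in the answer.
EXTEND = "import pkgutil\n__path__ = pkgutil.extend_path(__path__, __name__)\n"


def make_tree(python_dir, customer, project, package, module):
    """Create <python_dir>/<customer>/<project>/<package>/<module>.py (hypothetical helper)."""
    cust = Path(python_dir) / customer
    pkg = cust / project / package
    pkg.mkdir(parents=True)
    (cust / "__init__.py").write_text(EXTEND)        # customer namespace
    (cust / project / "__init__.py").write_text("")  # empty
    (pkg / "__init__.py").write_text("")             # empty
    origin = f"{customer}.{project}.{package}.{module}"
    (pkg / f"{module}.py").write_text(f"origin = {origin!r}\n")
    return str(python_dir)


base = Path(tempfile.mkdtemp())
# Two projects of the same customer, each with its own folder on sys.path.
dir_a = make_tree(base / "proj4" / "python", "cust3", "proj4", "pack1", "mod4")
dir_b = make_tree(base / "proj1" / "python", "cust3", "proj1", "pack4", "mod4")

sys.path[:0] = [dir_a, dir_b]
importlib.invalidate_caches()

from cust3.proj4.pack1.mod4 import origin as first
from cust3.proj1.pack4.mod4 import origin as second
import cust3

print(first, second)
# After extend_path, the customer package spans both physical folders.
print(list(cust3.__path__))
```

Without the extend_path line, cust3.__path__ would contain only the folder found first on sys.path and the second import would fail.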

I can now access all code in a consistent manner criss-cross between all projects, e.g.

from cust1.proj1.pack1.mod1 import something
from cust3.proj4.pack1.mod4 import something_else
from cust3.proj1.pack4.mod4 import yet_something_else

On the downside, I had to create an even deeper project folder structure:

/path/
  to/
    projects/
      cust1/
        proj1/
          Development/
            code/
              python/
                cust1/
                  __init__.py   <--- contains code as described above
                  proj1/
                    __init__.py <--- empty
                    pack1/
                      __init__.py <--- empty
                      mod1.py

but that seems very acceptable to me, especially considering how little effort I need to make to maintain this convention. sys.path is extended by /path/to/projects/cust1/proj1/Development/code/python for this project.

On a side note, I noticed that of all the __init__.py files for the same customer, only the one in the path that appears first in sys.path is executed, no matter which project I import from.
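This observation can be checked with a small experiment. The sketch below assumes Python 3 and temporary stand-in folders; the attribute executed_from is a made-up marker added to each customer-level __init__.py purely to record which file actually ran.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Customer-level __init__.py: the extend_path lines plus a marker (assumption
# for illustration) that records which copy of the file was executed.
INIT = (
    "import pkgutil\n"
    "__path__ = pkgutil.extend_path(__path__, __name__)\n"
    "executed_from = __file__\n"
)

base = Path(tempfile.mkdtemp())
roots = []
for project in ("proj1", "proj2"):
    root = base / project / "python"
    pkg = root / "cust1" / project
    pkg.mkdir(parents=True)
    (root / "cust1" / "__init__.py").write_text(INIT)
    (pkg / "__init__.py").write_text("")
    roots.append(str(root))

# proj1's folder is first on sys.path.
sys.path[:0] = roots
importlib.invalidate_caches()

# Import from the second project.
import cust1.proj2
import cust1

print(cust1.executed_from)
print(cust1.proj2.__file__)
```

The marker points at the __init__.py under the first folder even though the imported sub-package lives under the second one, so any code beyond the extend_path lines should be kept identical across all copies of the customer-level file.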

answered Oct 25 '22 at 20:10 by ssc