At our company we use Subversion. We have various Python modules (our own and third-party) in use, in different versions. The applications we develop each depend on particular versions of these shared modules.
One possibility is using virtualenv and installing the modules from a local PyPI server. On every initial checkout we then need to create a virtualenv, activate it, and install the dependent modules from requirements.txt.
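For illustration, the initial checkout might look something like this (the repository and package-index URLs are placeholders):

    svn checkout http://svn.example.com/myapp/trunk myapp
    cd myapp
    virtualenv venv                  # create an isolated environment
    source venv/bin/activate         # activate it for this shell
    pip install -i http://pypi.example.com/simple -r requirements.txt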
Disadvantages:
So we came up with another solution, and I ask for your opinion: in the path of the application we use svn:externals (the Subversion analogue of git submodules) to "link" to the specified module (from its release path, and with a specified revision number to keep it read-only), so the module is placed locally in the path of the application. An "import mylib" will then work as if the module were installed in Python's site-packages or in a virtualenv. This could be extended to put releases of wx, numpy, and other frequently used libraries into our repository and link them locally.
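A sketch of how one such external might be wired up (the repository URL, tag path, and revision number are hypothetical):

    # run in the application directory that should contain the module
    svn propset svn:externals "-r 1234 http://svn.example.com/libs/mylib/tags/1.2/source/lib/mylib mylib" .
    svn update                       # fetches the pinned external into ./mylib
    svn commit -m "pin mylib 1.2 at r1234 via svn:externals"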
The advantages are:
The actual question is: are there projects out there on github/sourceforge using this scheme? Why is everybody using virtualenv instead of this (seemingly) simpler scheme? I have never seen such a solution, so maybe we are missing a point?
PS: I posted this already on the pypa-dev mailing list, but it seems to be the wrong place for this kind of question. Please excuse the cross-post.
In the path of the application we use svn:externals (the Subversion analogue of git submodules) to "link" to the specified module (from its release path, and with a specified revision number to keep it read-only), so the module is placed locally in the path of the application.
This is a more traditional method for managing package dependencies, and is the simpler of the two options for software which is only used internally. With regards to...
After the initial checkout you're ready to run
...that's not strictly true. If one of your dependencies is a Python library written in C, it will need to be compiled first.
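For example, a checked-out dependency with C extensions typically needs a build step before it is importable; with a distutils/setuptools setup.py that might be:

    cd submodules/libfoo
    python setup.py build_ext --inplace   # compile the C extensions next to the sources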
We tried it with git's submodule functionality but it's not possible to get a subpath of a repository (like /source/lib)
This is fairly easy to work around if you check out the whole repository in a location outside your PYTHONPATH, then just symlink to the required files or directories inside your PYTHONPATH, although it does require you to be using a filesystem which supports symlinks.
For example, with a layout like...
myproject
|- bin
|  |- myprogram.py
|
|- lib
|  |- mymodule.py
|  |- mypackage
|  |  |- __init__.py
|  |
|  |- foopackage -> ../submodules/libfoo/lib/foopackage
|  |- barmodule
|     |- __init__.py -> ../../submodules/libbar/lib/barmodule.py
|
|- submodules
   |- libfoo
   |  |- bin
   |  |- lib
   |     |- foopackage
   |        |- __init__.py
   |
   |- libbar
      |- bin
      |- lib
         |- barmodule.py
...you need only have myproject/lib in your PYTHONPATH, and everything should import correctly.
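Assuming the layout above, the symlinks could be created like so (run from myproject/lib):

    ln -s ../submodules/libfoo/lib/foopackage foopackage
    mkdir barmodule
    ln -s ../../submodules/libbar/lib/barmodule.py barmodule/__init__.py
    export PYTHONPATH="$PWD:$PYTHONPATH"   # make myproject/lib importable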
Are there projects out there on github/sourceforge using this scheme?
The submodule information is just stored in a file called .gitmodules, and a quick Google for "site:github.com .gitmodules" returns quite a few results.
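For instance, adding a submodule records it in .gitmodules (the URL here is hypothetical):

    git submodule add https://github.com/example/libfoo.git submodules/libfoo
    # .gitmodules now contains an entry like:
    #   [submodule "submodules/libfoo"]
    #       path = submodules/libfoo
    #       url = https://github.com/example/libfoo.git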
Why is everybody using virtualenv instead of this (seemingly) simpler scheme?
For packages published on PyPI, and installed with pip, it's arguably easier from a dependency-management point-of-view.
If your software has a relatively simple dependency graph, like...
myproject
|- libfoo
|- libbar
...it's no big deal, but when it becomes more like...
myproject
|- libfoo
|  |- libsubfoo
|     |- libsubsubfoo
|        |- libsubsubsubfoo
|           |- libsubsubsubsubfoo
|- libbar
   |- libsubbar1
   |- libsubbar2
   |- libsubbar3
   |- libsubbar4
...you may not want to take on the responsibility of working out which versions of all those sub-packages are compatible, should you need to upgrade libbar for whatever reason. You can delegate that responsibility to the maintainer of the libbar package.
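With pip, that delegation is automatic: installing the top-level package pulls in whatever its maintainer declared as dependencies (the package names here are the hypothetical ones from the diagram):

    pip install libbar    # resolves and installs libsubbar1..libsubbar4 as declared by libbar
    pip show libbar       # the "Requires:" line lists the sub-packages it pulled in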
In your particular case, the decision as to whether your solution is the right one will depend on the answers to these questions:

- Are all the modules you depend on available from your own svn repositories?
- Are the maintainers of those modules using svn:externals correctly to include compatible versions of any dependencies they require, or if not, are you prepared to take on the responsibility of managing those dependencies yourself?

If the answer to both questions is "yes", then your solution is probably right for your case.