
Python: Multiple packages in one repository or one package per repository?

I have a big Python 3.7+ project that I am splitting into multiple packages that can be installed separately. My initial thought was to have a single Git repository with multiple packages, each with its own setup.py. However, while researching this, I found people suggesting one repository per package (e.g., "Python - setuptools - working on two dependent packages (in a single repo?)"). Nobody provides a good explanation of why they prefer such a structure.

So, my questions are the following:

  • What are the implications of having multiple packages (each with its own setup.py) on the same GitHub repo?
  • Am I going to face issues with such a setup?
  • Are the common Python tools (documentation generators, PyPI packaging, etc.) compatible with such a setup?
  • Is there a good reason to prefer one setup over the other?
  • Please keep in mind that this is not an opinion-based question. I want to know if there are any technical issues or problems with any of the two approaches.

Also, I am aware (and please correct me if I am wrong) that setuptools now allows installing dependencies from GitHub repos, even if the setup.py is not at the root of the repository.

AstrOne asked Jan 19 '19 10:01



3 Answers

One aspect is covered here https://pip.readthedocs.io/en/stable/reference/pip_install/#vcs-support

In particular, if setup.py is not in the root directory you have to specify the subdirectory where to find setup.py in the pip install command.

So if your repository layout is:

  • pkg_dir/
    • setup.py # setup.py for package pkg
    • some_module.py
  • other_dir/
    • some_file
    • some_other_file

You’ll need to use pip install -e "vcs+protocol://repo_url/#egg=pkg&subdirectory=pkg_dir". Quote the URL, because most shells treat an unquoted & as a control operator.
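As a minimal sketch of how such a URL is put together, the helper below assembles the `#egg=...&subdirectory=...` fragment. The function name `vcs_requirement` and the repository URL are hypothetical and only serve as illustration; the fragment keys `egg` and `subdirectory` are the ones described in the pip documentation linked above.

```python
from urllib.parse import urlencode


def vcs_requirement(repo_url: str, egg: str, subdirectory: str = "", vcs: str = "git") -> str:
    """Build a pip VCS URL: <vcs>+<repo_url>#egg=<egg>&subdirectory=<subdirectory>."""
    fragment = {"egg": egg}
    if subdirectory:
        fragment["subdirectory"] = subdirectory
    # safe="/" keeps nested subdirectories such as "packages/pkg" unescaped.
    return f"{vcs}+{repo_url}#{urlencode(fragment, safe='/')}"


# Hypothetical repository URL, used only to show the resulting shape.
url = vcs_requirement("https://github.com/example/monorepo.git", "pkg", "pkg_dir")
# The URL is wrapped in double quotes so the shell does not interpret "&".
print(f'pip install -e "{url}"')
```

With several packages in one repository, each package gets its own URL that differs only in the `egg` and `subdirectory` values.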

Teitur answered Nov 09 '22 11:11


"Best" approach? That's a matter of opinion, which is not the domain of SO. But here are a couple of justifications for creating separate packages:

  1. The package is functionally independent of the other packages in your project.
    That is, it doesn't import from them and it performs a function that could be useful to other developers. Extra points if the function this package performs is similar to packages already on PyPI. Extra points if the package has a stable API and clear documentation. Penalty points if the package is a thin grab bag of unrelated functions that you factored out of multiple packages for ease of maintenance, with no unifying principle behind them.
  2. The package is optional with respect to your main project, so there'd be cases where users could reasonably choose to skip installing it.
    Perhaps one package is a "client" and the other is the "server". Or perhaps the package provides OS-specific capabilities. Note that a package like this is not functionally independent of the main project and so does not qualify under the previous point, but this would still be a good reason to separate it.

I agree with @boriska's point that the "single package" project structure is a maintenance convenience well worth striving for. But not (and this is just my opinion, I'm going to get downvoted for expressing it) at the expense of cluttering up the public package index with a large number of small packages that are never installed separately.

BobHy answered Nov 09 '22 12:11


I am researching the same issue myself. The PyPA documentation recommends the layout described in the 'native' subdirectory of https://github.com/pypa/sample-namespace-packages
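A rough sketch of the mechanism the 'native' layout relies on, as far as I understand it: two separately installable distributions contribute subpackages to one shared namespace, and the namespace directory itself has no `__init__.py` (PEP 420). The names `example_ns`, `dist_a`, `dist_b`, `pkg_a` and `pkg_b` are made up for this demo, and putting both directories on `sys.path` only stands in for installing each distribution through its own setup.py, which the sketch omits.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_distribution(root: Path, dist_name: str, subpackage: str) -> Path:
    """Create <root>/<dist_name>/example_ns/<subpackage>/__init__.py and return the dist dir."""
    dist_dir = root / dist_name
    pkg_dir = dist_dir / "example_ns" / subpackage
    pkg_dir.mkdir(parents=True)
    # No __init__.py is written into example_ns/ itself: that omission is
    # what makes it a native (PEP 420) namespace package.
    (pkg_dir / "__init__.py").write_text(f"NAME = {subpackage!r}\n")
    return dist_dir


root = Path(tempfile.mkdtemp())
for dist_name, subpackage in [("dist_a", "pkg_a"), ("dist_b", "pkg_b")]:
    # Stand-in for installing each distribution separately.
    sys.path.insert(0, str(make_distribution(root, dist_name, subpackage)))

importlib.invalidate_caches()
pkg_a = importlib.import_module("example_ns.pkg_a")
pkg_b = importlib.import_module("example_ns.pkg_b")
print(pkg_a.NAME, pkg_b.NAME)
```

Both subpackages import under the same top-level name even though they live in different directory trees, which is what lets them sit in one repository while still being packaged separately.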

I find the single-package structure described at https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure very useful; see in particular the discussion around testing the 'installed' version. I think this can be extended to multiple packages. Will post as I learn more.

boriska answered Nov 09 '22 12:11