
How does the value of the name parameter to setuptools.setup affect the results?

I recently received a bundle of Python code, written by a graduate student at an academic lab, consisting of a Python script and about half a dozen single-file Python modules used by the script. All these files (script and modules) are in the same directory.

I wanted to use pip to install this code in a virtual environment, so I tried my hand at writing a setup.py file for it, something I had not done before.

I got this installation to work, and I have a vague understanding of what most of the stuff I put in the setup.py means.

The one exception to this is the value to the name keyword to the setuptools.setup function.

According to the documentation I found, this parameter is supposed to be the "name of the package", but this doesn't tell me how its value ultimately matters. In other words, is this value important only to human readers, or does it actually affect how pip install works, or how the code it installs works?

Therefore, I had no idea what value to give to this parameter, and so I just came up with a reasonable-sounding name, without any attempt to have it match something else in the code base. To my surprise, nothing broke! By this I mean that the pip installation completed without errors, and the installed code performed correctly in the virtual environment.

I experimented a bit, and it seems that pretty much any value I came up with was equally OK.

For the sake of the following description, suppose I give the name parameter the value whatever. Then, the only effect this has, as far as I can tell, is that a subdirectory with the name whatever.egg-info/ gets created (by pip?) in the same directory as the setup.py file, and this subdirectory contains two files that include the string whatever in them.

One of these files is whatever.egg-info/PKG-INFO, which contains the line

Name: whatever

The other one is whatever.egg-info/SOURCES.txt, which lists several relative paths, including some beginning with whatever.egg-info/.


Maybe this was too simple a packaging problem for the value of name to matter?

Q: Can someone give me a simple example in which a wrong value for setuptools.setup's name parameter would cause either pip install or the installed code to fail?

kjo asked Jul 07 '20 22:07



3 Answers

Preamble: The Python glossary defines a package as "a Python module which can contain submodules or recursively, subpackages". What setuptools and the like create is usually referred to as a distribution which can bundle one or more packages (hence the parameter setup(packages=...)). I will use this meaning for the terms package and distribution in the following text.


The name parameter determines how your distribution will be identified throughout the Python ecosystem. It is not related to the actual layout of the distribution (i.e. its packages) nor to any modules defined within those packages.

The documentation precisely specifies what makes a legal distribution name:

The name of the distribution. The name field is the primary identifier for a distribution. A valid name consists only of ASCII letters and numbers, period, underscore and hyphen. It must start and end with a letter or number. Distribution names are limited to those which match the following regex (run with re.IGNORECASE): ^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$.

(History: This specification was refined in PEP 566 to align with the definition in PEP 508. Before that, PEP 345 specified distribution names only loosely, without imposing any restrictions.)
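As a minimal sketch, the quoted rule can be checked with Python's re module. The helper name is_valid_dist_name is hypothetical and introduced only for illustration; the pattern itself is the one quoted above.

```python
import re

# Pattern quoted from the specification above; it is meant to be
# applied case-insensitively.
NAME_RE = re.compile(r"^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$", re.IGNORECASE)


def is_valid_dist_name(name: str) -> bool:
    """Return True if `name` matches the distribution-name pattern."""
    return NAME_RE.match(name) is not None


print(is_valid_dist_name("whatever"))    # True
print(is_valid_dist_name("dist-a"))      # True
print(is_valid_dist_name("-bad-start"))  # False: must start with a letter or number
print(is_valid_dist_name("has space"))   # False: spaces are not allowed
```

This only checks the syntax of a name. It says nothing about whether the name is already in use on PyPI or in your environment.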

In addition to the above limitations there are some other aspects to consider:

  • If you intend to publish your distribution on PyPI, note that no distinction is made between _ and -, i.e. hello_world and hello-world are considered to be the same distribution. You also need to make sure that the distribution name is not already taken on PyPI, because otherwise you won't be able to upload it (if it's occupied by an abandoned project, you can attempt to claim ownership of that project in order to be able to use the name; see PEP 541 for more information).
  • Most importantly you should make sure that the distribution name is unique within your working environment, i.e. that it doesn't conflict with other distributions' names. Suppose you have already installed the requests project in your virtual environment and you decide to name your distribution requests as well. Then installing your distribution will remove the already existing installation (i.e. the corresponding package) and you won't be able to access it anymore.
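The equivalence of _ and - follows from the name normalization rule in PEP 503, which also folds periods and letter case. The sketch below applies that rule; the function name normalize is just an illustrative choice.

```python
import re


def normalize(name: str) -> str:
    """Normalize a distribution name as described in PEP 503."""
    # Runs of "-", "_" and "." collapse to a single "-", then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


print(normalize("hello_world"))  # hello-world
print(normalize("Hello-World"))  # hello-world
print(normalize("hello-world") == normalize("hello_world"))  # True
```

Two names that normalize to the same string refer to the same project as far as PyPI is concerned.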

Top-level package names

The second bullet point above also applies to the names of the top-level packages in your distribution. Suppose you have the following distribution layout:

.
├── setup.py
└── testpkg
    ├── __init__.py
    └── a.py

The setup.py contains:

from setuptools import setup

setup(
    name='dist-a',
    version='1.0',
    packages=['testpkg'],
)

__init__.py and a.py are just empty files. After installing that distribution you can access it by importing testpkg (the top-level package).

Now suppose that you have a different distribution with name='dist-b' that uses the same packages=['testpkg'] but provides a module b.py (instead of a.py). The second install is then performed over the existing one: it uses the same physical directory (testpkg, the package shared by both distributions) and may replace modules that are already there. Both distributions are nevertheless registered as installed:

$ pip freeze | grep '^dist-'
dist-a @ file:///tmp/test-a
dist-b @ file:///tmp/test-b
$ python
>>> import testpkg
>>> import testpkg.a
>>> import testpkg.b

Now uninstalling the first distribution (dist-a) will also remove the contents of the second:

$ pip uninstall dist-a
$ python
>>> import testpkg
ModuleNotFoundError: No module named 'testpkg'

Hence besides the distribution name it's also important to make sure that its top-level packages don't conflict with the ones of already installed projects. It's those top-level packages that serve as namespaces for the distribution. For that reason it's a good idea to choose a distribution name which resembles the name of the top-level package - often these are chosen to be the same.
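One way to spot such clashes is to list which installed distributions claim the same top-level import name. The following is a rough sketch, not a complete tool: the function names are hypothetical, and it assumes each distribution ships a top_level.txt file, which setuptools-built distributions usually do. Distributions without one are skipped.

```python
from collections import defaultdict
from importlib import metadata  # Python 3.8+


def installed_top_levels():
    """Map each installed distribution name to its top-level import names.

    Assumption: the names are read from top_level.txt; distributions
    that do not provide that file are ignored.
    """
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        text = dist.read_text("top_level.txt")
        if name and text:
            result[name] = text.split()
    return result


def shared_top_levels(top_levels):
    """Return {import_name: [distributions]} for names claimed more than once."""
    owners = defaultdict(set)
    for dist_name, packages in top_levels.items():
        for package in packages:
            owners[package].add(dist_name)
    return {pkg: sorted(dists) for pkg, dists in owners.items() if len(dists) > 1}


# The scenario described above: two distributions, one top-level package.
example = {"dist-a": ["testpkg"], "dist-b": ["testpkg"], "other": ["otherpkg"]}
print(shared_top_levels(example))  # {'testpkg': ['dist-a', 'dist-b']}
```

Calling shared_top_levels(installed_top_levels()) runs the same check against the current environment; the result depends on what is installed there.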

a_guest answered Oct 07 '22 09:10


The name is basically metadata which does not directly affect your code, unless you are pulling in the metadata, or building it into an exe with something like PyInstaller.

And as jdaz's answer points out, PyPI name collisions are a consideration, but only if you are planning to upload/distribute your code on PyPI. The setuptools utilities work just as well for managing Python packaging for local distributions through Git, network shares, or even thumb drives. Or, just for private projects you are never planning to distribute.

Note that the my_project.egg-info folder is chock-full of other metadata, such as description and versioning. For example, you can store your current version info in the PKG-INFO file and use:

  1. setuptools (oldschool)
  2. pbr (a setuptools plugin that works well with Git - more info in this Q&A)
  3. or other tools, such as the newer built-in importlib.metadata package (see Python 3.8 docs)

to access that version info programmatically from within your script (as a string, tuple, etc.)
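As a minimal sketch of the third option, importlib.metadata looks up an installed distribution by its distribution name, i.e. the value passed as name= in setup.py, not by the name of any importable module. The helper get_version is hypothetical, and "whatever" stands in for the name used in the question.

```python
from importlib.metadata import PackageNotFoundError, version  # Python 3.8+


def get_version(dist_name):
    """Return the installed version of `dist_name`, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "whatever" is the value given to name= in the question's setup.py.
# The result is a version string if that distribution is installed,
# and None otherwise.
print(get_version("whatever"))
```

This is one case where the value of name affects the installed code: if the script asks for a distribution name that differs from the one given in setup.py, the lookup fails even though the modules themselves import fine.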

Other metadata, such as the description and package requirements, is also available. The Python Packaging User Guide and other tutorials typically highlight the metadata needed to upload to PyPI, but if you aren't planning to distribute publicly, feel free to fill in what you want and ignore the rest (or roll your own).

LightCC answered Oct 07 '22 08:10


From the Python packaging tutorial:

  • name is the distribution name of your package. This can be any name as long as [it] only contains letters, numbers, _ , and -. It also must not already be taken on pypi.org.

(Emphasis added)

This name therefore is the name of the package on PyPI and is the argument for pip install. It is independent of, and not used by, any of your actual package code.

If you used whatever as the name and uploaded it to PyPI, then any user in the world could type pip install whatever to install your package, and they could get details at https://pypi.org/project/whatever/ (which, in fact, is already taken!).

EDIT:

When you run python setup.py sdist bdist_wheel, you will end up with a .tar.gz source archive and a .whl file, both named after the name you provided in setuptools.setup. You can then use these to install your package locally or distribute them however else you wish, outside of PyPI.

Even locally, though, package names must be unique to avoid conflicts. If you try to install two packages with the same name and same version number, you will get a Requirement already satisfied message and pip will exit. If the version numbers do not match, the existing package will be uninstalled and the new package will replace it.

jdaz answered Oct 07 '22 08:10