
Differences between setuptools and pip's dependency resolution

I've recently begun packaging my first project with SetupTools, and have mostly been successful.

Unfortunately, I've run into a confusing situation - my project depends on a single-file module which isn't available on PyPI. I've been able to configure setup.py to depend on that module easily, using the dependency_links option, and everything works... so long as I'm using setup.py to install it. If I try to use pip to install the project egg, it fails while trying to install the module, assuming that it must be a pre-made egg archive. In comparison, setup.py detects that it's a simple source file and generates an egg from that.

My aim is to have my project available on PyPI, so it's important that it be installable using just pip; so my question is... am I doing something wrong?

My understanding was that setuptools is essentially a means to an end, that end being pip and PyPI, so it seems very strange to me that the two tools should behave so differently.

The relevant part of setup.py and output from each tool follows:

setup(
    name='particle-fish',
    version='0.1.0',
    description='Python Boilerplate contains all the boilerplate you need to create a Python package.',
    long_description=readme + '\n\n' + history,
    author='Lachlan Pease',
    author_email='[email protected]',
    url='https://github.com/predakanga/particle-fish',
    packages=[
        'particle.plugins'
    ],
    include_package_data=True,
    install_requires=['particle', 'irccrypt', 'pycrypto'],
    dependency_links=['http://www.bjrn.se/code/irccrypt/irccrypt.py#egg=irccrypt-1.0'],
    license="BSD",
    zip_safe=False,
    keywords='particle-fish',
    classifiers=[
        'Development Status :: 2 - Pre-Alpha',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: BSD License',
        'Natural Language :: English',
        "Programming Language :: Python :: 2",
        'Programming Language :: Python :: 2.6',
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.3',
    ],
    test_suite='tests',
    tests_require=['pytest', 'mock', 'coverage', 'pytest-cov'],
    cmdclass={'test': PyTest},
)

Output from setup.py install:

Installed /Users/lachlan/.virtualenvs/particle-fish/lib/python2.7/site-packages/particle_fish-0.1.0-py2.7.egg
Processing dependencies for particle-fish==0.1.0
Searching for irccrypt
Best match: irccrypt 1.0
Downloading http://www.bjrn.se/code/irccrypt/irccrypt.py#egg=irccrypt-1.0
Processing irccrypt.py
Writing /var/tmp/easy_install-svPfHF/setup.cfg
Running setup.py -q bdist_egg --dist-dir /var/tmp/easy_install-svPfHF/egg-dist-tmp-Xq3OCt
zip_safe flag not set; analyzing archive contents...
Adding irccrypt 1.0 to easy-install.pth file

Output from pip install:

Downloading/unpacking irccrypt (from particle-fish==0.1.0)
  Downloading irccrypt.py
  Cannot unpack file /private/var/tmp/pip-mCc6La-unpack/irccrypt.py (downloaded from /Users/lachlan/.virtualenvs/particle-staging/build/irccrypt, content-type: text/plain); cannot detect archive format
Cleaning up...
Cannot determine archive format of /Users/lachlan/.virtualenvs/particle-staging/build/irccrypt

Lachlan Pease asked Oct 04 '22 02:10


1 Answer

There are two parts to your question:

  1. What is the correct way to handle dependency_links specified through setup?
  2. Why is this not being handled consistently?

What is the correct way to handle dependency_links specified through setup?

The setuptools documentation says this about the dependency_links argument to the setup method:

dependency_links
A list of strings naming URLs to be searched when satisfying dependencies. These links will be used if needed to install packages specified by setup_requires or tests_require. They will also be written into the egg’s metadata for use by tools like EasyInstall to use when installing an .egg file.

Now, this is pretty vague. They do mention EasyInstall, so we can dig into the source code there to determine how it handles dependency_links. Setuptools also documents the internal structure of eggs, which includes the dependency-links.txt file that corresponds to the dependency_links argument for setup. It gives us slightly more insight into what the links are actually supposed to be:

A list of dependency URLs, one per line, as specified using the dependency_links keyword to setup(). These may be direct download URLs, or the URLs of web pages containing direct download links, and will be used by EasyInstall to find dependencies, as though the user had manually provided them via the --find-links command line option. Please see the setuptools manual and EasyInstall manual for more information on specifying this option, and for information on how EasyInstall processes --find-links URLs.

(bold emphasis mine)

This is useful because it spells out both how the links are handled and what they are expected to be. It is more precise than the dependency_links description, since it explains what "URLs to be searched" actually means: direct download links, or index pages containing them. The source code confirms that --find-links and dependency_links are merged when building the egg, so we can look at EasyInstall's --find-links option to see what it expects.

--find-links=URLS_OR_FILENAMES, -f URLS_OR_FILENAMES
Scan the specified “download pages” or directories for direct links to eggs or other distributions.

There is more detailed information there, but the general idea is that --find-links is looking for eggs or archives that contain the packages that are not found. So there's the first part of the answer to your mystery: dependency_links is for pointing to full packages, not individual, unpackaged modules.

This matches how pip processes the links you are including. You can confirm it by looking at pip's tests: the two that use dependency_links both assume that it points to an index page. That is in line with the setuptools description, where dependency_links is "a list of strings naming URLs to be searched".
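
As a rough illustration of the distinction, the check below separates links that point directly at a packaged distribution from links that point at a bare module. The suffix list and the function name are assumptions made for this sketch. It is not pip's actual detection logic, which also considers things like the content type, as the error output in the question shows.

```python
from urllib.parse import urlparse

# Assumed, illustrative list of distribution suffixes (not taken from pip).
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".zip", ".egg", ".whl")


def is_direct_archive_link(url):
    """Return True if the URL path ends with a known distribution suffix."""
    # The '#egg=...' fragment is not part of the path, so it is ignored here.
    path = urlparse(url).path.lower()
    return path.endswith(ARCHIVE_SUFFIXES)


print(is_direct_archive_link(
    "http://www.bjrn.se/code/irccrypt/irccrypt.py#egg=irccrypt-1.0"))
```

A link that fails this check might still be a valid index page, so the sketch only says something about direct download links.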


So, now that we know that pip is handling this according to the specification, that leaves the second question:

Why was this working with setuptools?

To understand why this worked with setuptools, you need to know how setuptools resolves dependencies and tries to download and install them. The PackageIndex.download method is called on any external URLs whose packages must be downloaded locally before they can be installed. Its docstring contains the answer to our question:

If it is the URL of a .py file with an unambiguous #egg=name-version tag (i.e., one that escapes - as _ throughout), a trivial setup.py is automatically created alongside the downloaded file.

This explains why setuptools was ignoring the fact that it wasn't a distribution, but pip was failing.
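
The behaviour described in that docstring can be sketched roughly as follows. This is a simplified illustration, not the setuptools implementation: the helper names are hypothetical, and the name/version split is a naive one that takes everything after the last hyphen as the version, whereas setuptools has to cope with ambiguous tags.

```python
import posixpath
from urllib.parse import urlparse


def split_egg_fragment(url):
    """Return (name, version) from a '#egg=name-version' fragment, or None."""
    fragment = urlparse(url).fragment
    if not fragment.startswith("egg="):
        return None
    # Naive split: the text after the last '-' is treated as the version.
    name, sep, version = fragment[len("egg="):].rpartition("-")
    if not sep or not name or not version:
        return None
    return name, version


def trivial_setup_py(url):
    """Return the text of a minimal setup.py for a bare .py URL, or None."""
    path = urlparse(url).path
    tag = split_egg_fragment(url)
    if tag is None or not path.endswith(".py"):
        return None
    name, version = tag
    module = posixpath.basename(path)[:-len(".py")]
    return (
        "from setuptools import setup\n"
        "setup(name=%r, version=%r, py_modules=[%r])\n" % (name, version, module)
    )


print(trivial_setup_py(
    "http://www.bjrn.se/code/irccrypt/irccrypt.py#egg=irccrypt-1.0"))
```

With a generated setup.py sitting next to the downloaded file, the module can be built like any other source distribution, which matches the "Running setup.py -q bdist_egg" line in the setup.py install output from the question.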


TL;DR: dependency_links should point to a package or archive containing one, not the unpackaged module. Setuptools realizes your pain and helps you out, but this is mostly undocumented. Consider repackaging the module and including it with your package if the license allows.
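
If you do repackage the module, one possible shape for the result is a small source archive holding the module and a minimal setup.py. The sketch below builds such an archive by hand with the standard library, purely to show what the archive needs to contain. The function name and layout are assumptions for illustration; in practice you would more likely write the setup.py yourself and build the archive with "python setup.py sdist".

```python
import io
import tarfile

# Minimal setup.py text for a single-module project.
SETUP_TEMPLATE = (
    "from setuptools import setup\n"
    "setup(name={name!r}, version={version!r}, py_modules=[{module!r}])\n"
)


def build_source_archive(module_name, module_source, version, out_path):
    """Write a .tar.gz containing <name>-<version>/setup.py and the module."""
    root = "%s-%s" % (module_name, version)
    files = {
        "setup.py": SETUP_TEMPLATE.format(
            name=module_name, version=version, module=module_name),
        module_name + ".py": module_source,
    }
    with tarfile.open(out_path, "w:gz") as archive:
        for filename, text in files.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=root + "/" + filename)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return out_path
```

An archive like this is something dependency_links can point to directly, since it is a packaged distribution rather than a bare module.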

Kevin Brown-Silva answered Oct 25 '22 19:10