Should I pin my Python dependencies versions?

I am about to release a Python library I've been working on the past few weeks. I've read a lot about Python dependencies but something is not quite clear yet:

Some people claim you should never pin your dependencies' versions, as it would prevent the users of your library from upgrading those dependencies.

Others claim that you should always pin your dependencies' versions, as it is the only way to guarantee that your release works the way it did when you developed it and to prevent a breaking change in a dependency from wreaking havoc in your library.

I went for a hybrid solution: I assumed my dependencies use semantic versioning and pinned only the major version number (say somelib >= 2.3.0, < 3), except when the major version number is 0 (semantic versioning dictates that such versions are to be considered volatile and may break the API even if only the patch number is bumped).
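
The hybrid rule described above can be sketched as a small helper. This is only an illustration: the name `hybrid_specifier` is hypothetical, it assumes plain `MAJOR.MINOR.PATCH` version strings, and pinning 0.x releases to the exact version is an assumption, since the exception for major version 0 is not spelled out here.

```python
def hybrid_specifier(name: str, version: str) -> str:
    """Build a requirement string from the version a library was developed against.

    Assumes `version` looks like "MAJOR.MINOR.PATCH" and that the dependency
    follows semantic versioning.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if major == 0:
        # Assumption: 0.x releases may break at any time, so pin exactly.
        return f"{name} == {version}"
    # Otherwise allow anything from the tested version up to the next major.
    return f"{name} >= {version}, < {major + 1}"


print(hybrid_specifier("somelib", "2.3.0"))  # somelib >= 2.3.0, < 3
print(hybrid_specifier("somelib", "0.4.1"))  # somelib == 0.4.1
```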

As of now, I'm not sure which way is best. Is there an official guideline (perhaps even a PEP?) that dictates the best practice regarding Python dependencies and how to specify them?

ereOn asked Feb 13 '15 21:02



2 Answers

Pinning can be problematic and lead to security risks. For a library in particular, as in your case, it can lead to more dependency conflicts, since the library will typically be used in combination with other PyPI packages that have dependencies of their own.

Why? A detailed study of Python Dependency Resolution, after analyzing tens of thousands of PyPI packages and their current rates of dependency conflicts, discusses this issue. It explains that:

if the distribution is not installed into its own, empty, single-purpose environment, then the likelihood of dependency conflicts is substantially increased if dependency versions are all pinned instead of leaving the ranges flexible.

and notes that pinning can exacerbate security problems by interfering with upgrading.

It advises:

If a project pins dependencies, then it must be prepared to issue a new release every time there is an important release of anything the project depends on directly or indirectly, all the way down the dependency chain.

nealmcb answered Sep 21 '22 13:09


The reason the two other answers contradict each other is that they're both right (and worth reading), but they apply to different situations.

If you're releasing a library on PyPI, you should declare whatever dependencies you know about, but not pin to a specific version. For example, if you know you need >= 1.2, but 1.4 is broken, then you can write something like somepkg >= 1.2, != 1.4. If one of the things you know is that somepkg follows SemVer, then you can add a < 2.

If you're building something like a web app that you deploy yourself, then you should pin all of your exact dependencies, and use a service like pyup.io or requires.io to notify you when new versions are released. This way you can stay up to date, while making sure that the versions you deploy are the same as the versions you tested against.
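
As a rough illustration of what "pin all of your exact dependencies" means in practice, here is a minimal sketch that flags requirement lines not pinned with `==`. The helper name `find_unpinned` and the sample lines are hypothetical, and the parsing is deliberately simplistic (it does not handle URLs, extras, or environment markers).

```python
def find_unpinned(requirement_lines):
    """Return the requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            # Skip blank lines and pip options such as "-r base.txt".
            continue
        # Treat anything without "==" (or with a wildcard) as unpinned.
        if "==" not in line or "*" in line:
            unpinned.append(line)
    return unpinned


sample = [
    "requests==2.31.0",  # pinned
    "flask>=2.0",        # a range, not a pin
    "six",               # no version at all
]
print(find_unpinned(sample))  # ['flask>=2.0', 'six']
```

A check like this could run in CI for an application; it would be the wrong check for a library, for the reasons given above.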

Notice that these two pieces of advice complement each other: it's just a fact that if app A uses library B, then either the author of A or the author of B can pin B's dependencies, but not both. So we have to pick one. The underlying principle here is that this is best done as late as possible, i.e., by the author of A, who can see their whole system; the job of library B is to pass on some useful hints to help A make these decisions. In particular, if all of the libraries that A depends on record exactly what they know about their underlying dependencies, then it's possible for A to make sensible decisions about what to do when they overlap. For example, if dependency B depends on requests >= 1.0, != 1.2, and dependency C depends on requests >= 1.1, then we can guess that 1.1 or 1.3 might be good versions to pin. If dependency B depends on requests == 1.1 and dependency C depends on requests == 1.2, then we're just stuck.
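
The requests example above can be checked mechanically. The following is a minimal, standard-library-only sketch; the helper names (`parse_version`, `parse_spec`, `satisfies`, `compatible`) are hypothetical, and the list of candidate versions is made up for the illustration. It compares versions as tuples of integers, which is a simplification of real version semantics (PEP 440); in a real project the `packaging` library's `SpecifierSet` is the better tool.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


def parse_version(text):
    """Turn "1.2" into (1, 2). Simplification: numeric components only."""
    return tuple(int(part) for part in text.strip().split("."))


def parse_spec(spec):
    """Turn ">= 1.0, != 1.2" into a list of (comparison, version) pairs."""
    clauses = []
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in OPERATORS.items():
            if clause.startswith(symbol):
                clauses.append((compare, parse_version(clause[len(symbol):])))
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return clauses


def satisfies(version, spec):
    """True if `version` meets every clause in `spec`."""
    parsed = parse_version(version)
    return all(compare(parsed, bound) for compare, bound in parse_spec(spec))


def compatible(candidates, *specs):
    """Versions from `candidates` that every spec accepts."""
    return [v for v in candidates if all(satisfies(v, s) for s in specs)]


available = ["1.0", "1.1", "1.2", "1.3"]  # hypothetical releases of requests

# B and C declare ranges: something is left for A to pin.
print(compatible(available, ">= 1.0, != 1.2", ">= 1.1"))  # ['1.1', '1.3']

# B and C both pin exactly, to different versions: nothing is left.
print(compatible(available, "== 1.1", "== 1.2"))  # []
```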

Nathaniel J. Smith answered Sep 18 '22 13:09