
How do I compare version numbers in Python?

I am walking a directory that contains eggs to add those eggs to the sys.path. If there are two versions of the same .egg in the directory, I want to add only the latest one.

I have a regular expression r"^(?P<eggName>\w+)-(?P<eggVersion>[\d\.]+)-.+\.egg$" to extract the name and version from the filename. The problem is comparing the version number, which is a string like 2.3.1.

Since I'm comparing strings, 2 sorts above 10, but that's not correct for versions.

    >>> "2.3.1" > "10.1.1"
    True

I could do some splitting, parsing, casting to int, etc., and I would eventually get a workaround. But this is Python, not Java. Is there an elegant way to compare version strings?
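For reference, the manual workaround alluded to above might look like the sketch below. It assumes every version is purely numeric and dot-separated; the helper name version_key is made up for illustration. It raises ValueError on something like 1.3.a4, and it treats 1.0 as older than 1.0.0, whereas PEP 440 considers those equal.

```python
def version_key(version_string):
    """Turn '2.3.1' into (2, 3, 1) so that parts compare as integers."""
    # Assumption: only digits and dots; anything else raises ValueError.
    return tuple(int(part) for part in version_string.split("."))


# Strings compare character by character, so "2" beats "1".
print("2.3.1" > "10.1.1")                            # True

# Tuples of ints compare element by element, so 2 < 10.
print(version_key("2.3.1") > version_key("10.1.1"))  # False
```

This is enough for simple numeric schemes, but it does not handle pre-releases, dev releases, or trailing zeros.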

asked Aug 09 '12 16:08 by BorrajaX




2 Answers

Use packaging.version.parse.

    >>> from packaging import version
    >>> version.parse("2.3.1") < version.parse("10.1.2")
    True
    >>> version.parse("1.3.a4") < version.parse("10.1.2")
    True
    >>> isinstance(version.parse("1.3.a4"), version.Version)
    True
    >>> isinstance(version.parse("1.3.xy123"), version.LegacyVersion)
    True
    >>> version.Version("1.3.xy123")
    Traceback (most recent call last):
    ...
    packaging.version.InvalidVersion: Invalid version: '1.3.xy123'

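packaging.version.parse is a third-party utility, but it is used by setuptools (so you probably already have it installed) and conforms to the current PEP 440. In the releases current when this answer was written, it returns a packaging.version.Version if the version is compliant and a packaging.version.LegacyVersion if not; the latter always sorts before valid versions. Note that packaging 22.0 removed LegacyVersion, so on newer releases parse raises InvalidVersion for a non-compliant string instead.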

Note: packaging has recently been vendored into setuptools.
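Applied to the original problem, picking the newest egg per name could look roughly like the sketch below. The regular expression is the one from the question; the function name latest_eggs, the constant EGG_RE, and the sample filenames are made up for illustration. It uses Version directly (rather than parse) so that it behaves the same on old and new packaging releases, and it silently skips filenames whose version is not valid PEP 440, which is a choice of this sketch rather than a requirement.

```python
import re

from packaging.version import InvalidVersion, Version

# Regex taken from the question (with the closing quote restored).
EGG_RE = re.compile(r"^(?P<eggName>\w+)-(?P<eggVersion>[\d\.]+)-.+\.egg$")


def latest_eggs(filenames):
    """Map each egg name to the filename carrying its highest version."""
    best = {}  # eggName -> (Version, filename)
    for filename in filenames:
        match = EGG_RE.match(filename)
        if match is None:
            continue  # not an egg filename this pattern recognises
        try:
            version = Version(match.group("eggVersion"))
        except InvalidVersion:
            continue  # e.g. a stray trailing dot; skipped in this sketch
        name = match.group("eggName")
        if name not in best or version > best[name][0]:
            best[name] = (version, filename)
    return {name: filename for name, (_, filename) in best.items()}


# Hypothetical directory listing.
eggs = ["foo-2.3.1-py2.7.egg", "foo-10.1.1-py2.7.egg", "bar-0.9-py2.7.egg"]
print(latest_eggs(eggs))
# {'foo': 'foo-10.1.1-py2.7.egg', 'bar': 'bar-0.9-py2.7.egg'}
```

The resulting filenames can then be joined with the directory path and appended to sys.path.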


An ancient and now deprecated method you might encounter is distutils.version. It is undocumented and conforms only to the superseded PEP 386 (and distutils itself was removed from the standard library in Python 3.12):

    >>> from distutils.version import LooseVersion, StrictVersion
    >>> LooseVersion("2.3.1") < LooseVersion("10.1.2")
    True
    >>> StrictVersion("2.3.1") < StrictVersion("10.1.2")
    True
    >>> StrictVersion("1.3.a4")
    Traceback (most recent call last):
    ...
    ValueError: invalid version number '1.3.a4'

As you can see, it treats valid PEP 440 versions as "not strict" and therefore doesn't match modern Python's notion of what a valid version is.

As distutils.version is undocumented, here are the relevant docstrings.

answered Sep 21 '22 11:09 by ecatmur


The packaging library contains utilities for working with versions and other packaging-related functionality. It implements PEP 440 (Version Identification) and, in releases before 22.0, is also able to parse versions that don't follow the PEP. It is used by pip and other common Python tools to provide version parsing and comparison.

    $ pip install packaging

    from packaging.version import parse as parse_version
    version = parse_version('1.0.3.dev')

This was split off from the original code in setuptools and pkg_resources to provide a more lightweight and faster package.
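Once parsed, the objects can be compared and sorted directly. The sketch below is only illustrative: the version strings are made up, and dropping invalid strings is one possible policy among several.

```python
from packaging.version import InvalidVersion, Version

# Hypothetical input, including one string that is not valid PEP 440.
raw = ["1.0.3", "1.0.3.dev0", "1.0.10", "1.0.3rc1", "not-a-version"]

valid = []
for text in raw:
    try:
        valid.append(Version(text))
    except InvalidVersion:
        pass  # this sketch simply drops strings that do not parse

ordered = sorted(valid)
print([str(v) for v in ordered])
# ['1.0.3.dev0', '1.0.3rc1', '1.0.3', '1.0.10']

print(max(valid))                 # 1.0.10
print(ordered[0].is_prerelease)   # True: dev releases count as pre-releases
```

Note that the dev release and the release candidate both sort before the final 1.0.3, and that 1.0.10 sorts after 1.0.3, which plain string comparison would get wrong.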


Before the packaging library existed, this functionality was (and can still be) found in pkg_resources, a package provided by setuptools. However, this is no longer preferred as setuptools is no longer guaranteed to be installed (other packaging tools exist), and pkg_resources ironically uses quite a lot of resources when imported. However, all the docs and discussion are still relevant.

From the parse_version() docs:

Parsed a project's version string as defined by PEP 440. The returned value will be an object that represents the version. These objects may be compared to each other and sorted. The sorting algorithm is as defined by PEP 440 with the addition that any version which is not a valid PEP 440 version will be considered less than any valid PEP 440 version and the invalid versions will continue sorting using the original algorithm.

The "original algorithm" referenced was defined in older versions of the docs, before PEP 440 existed.

Semantically, the format is a rough cross between distutils' StrictVersion and LooseVersion classes; if you give it versions that would work with StrictVersion, then they will compare the same way. Otherwise, comparisons are more like a "smarter" form of LooseVersion. It is possible to create pathological version coding schemes that will fool this parser, but they should be very rare in practice.

The documentation provides some examples:

If you want to be certain that your chosen numbering scheme works the way you think it will, you can use the pkg_resources.parse_version() function to compare different version numbers:

    >>> from pkg_resources import parse_version
    >>> parse_version('1.9.a.dev') == parse_version('1.9a0dev')
    True
    >>> parse_version('2.1-rc2') < parse_version('2.1')
    True
    >>> parse_version('0.6a9dev-r41475') < parse_version('0.6a9')
    True

answered Sep 22 '22 11:09 by davidism