
Python API Compatibility Checker

In my current work environment, we produce a large number of Python packages for internal use (tens, if not hundreds). Each package has some dependencies, usually on a mixture of internal and external packages, and some of these dependencies are shared.

As we approach dependency hell, updating dependencies becomes a time-consuming process. While we care about the functional changes a new version might introduce, of equal (if not more) importance are the API changes that break the code.

Although running unit/integration tests against newer versions of a dependency helps us to catch some issues, our coverage is not close enough to 100% to make this a robust strategy. Release notes and a change log help identify major changes at a high level, but these rarely exist for internally developed tools, and when they do they rarely go into enough detail to understand the implications the new version has on the (public) API.

I am looking at other ways to automate this process.

I would like to be able to automatically compare two versions of a Python package and report the API differences between them. In particular, this would include backwards-incompatible changes such as removing functions/methods/classes/modules, adding positional arguments to a function/method/class, and changing the number of items a function/method returns. As a developer, based on the report this generates, I should have a greater understanding of the code-level implications this version change will introduce, and so of the time required to integrate it.

Elsewhere, we use the C++ abi-compliance-checker and are looking at the Java api-compliance-checker to help with this process. Is there a similar tool available for Python? I have found plenty of lint/analysis/refactor tools but nothing that provides this level of functionality. I understand that Python's dynamic typing will make a comprehensive report impossible.

If such a tool does not exist, are there any libraries that could help with implementing a solution? For example, my current approach would be to use an ast.NodeVisitor to traverse the package and build a tree where each node represents a module/class/method/function, and then compare this tree to that of another version of the same package.

Edit: since posting the question I have found pysdiff, which covers some of my requirements, but I am still interested to see alternatives.

Edit: I also found Upstream-Tracker, which is a good example of the sort of information I'd like to end up with.

Mark Streatfield asked Feb 21 '14 02:02




2 Answers

What about using the AST module to parse the files?

import ast

with open("test.py") as f:
    python_src = f.read()

node = ast.parse(python_src)  # Note: parses only; doesn't compile the src
print(ast.dump(node))

There's the ast.walk() function (described at http://docs.python.org/2/library/ast.html), which iterates over a node and all of its descendants.

The astdump package (available on PyPI) might work.

There is also this out-of-date pretty printer: http://code.activestate.com/recipes/533146-ast-pretty-printer/

The documentation tool Sphinx also extracts the information you are looking for. Perhaps give that a look.

So walk the AST and build a tree with the information you want in it. Once you have a tree you can pickle it and diff later or convert the tree to a text representation in a text file you can diff with difftools, or some external diff program.
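As a rough illustration of the "walk the AST, build a tree, diff it" idea, here is a minimal sketch for Python 3. The names ApiCollector, collect_api and diff_api are made up for this example, and the "tree" is flattened into a dict of dotted names. It assumes "public" means "does not start with an underscore" (with __init__ kept so constructor changes show up), and it ignores keyword-only arguments, decorators and return values.

```python
import ast


class ApiCollector(ast.NodeVisitor):
    """Collect public classes and functions as dotted name -> signature string."""

    def __init__(self):
        self.api = {}
        self._scope = []  # names of the enclosing classes

    def visit_ClassDef(self, node):
        if node.name.startswith("_"):
            return
        self._scope.append(node.name)
        self.api[".".join(self._scope)] = "class"
        self.generic_visit(node)  # descend to pick up methods
        self._scope.pop()

    def visit_FunctionDef(self, node):
        if node.name.startswith("_") and node.name != "__init__":
            return
        args = node.args
        # posonlyargs only exists on Python 3.8+, hence the getattr.
        names = [a.arg for a in getattr(args, "posonlyargs", []) + args.args]
        # Defaults line up with the last len(defaults) positional arguments.
        n_required = len(names) - len(args.defaults)
        parts = names[:n_required] + [n + "=?" for n in names[n_required:]]
        if args.vararg:
            parts.append("*" + args.vararg.arg)
        if args.kwarg:
            parts.append("**" + args.kwarg.arg)
        qualified = ".".join(self._scope + [node.name])
        self.api[qualified] = "({})".format(", ".join(parts))
        # No generic_visit here: nested functions are not treated as public.

    visit_AsyncFunctionDef = visit_FunctionDef


def collect_api(source):
    """Parse source text (without compiling it) and return its public API."""
    collector = ApiCollector()
    collector.visit(ast.parse(source))
    return collector.api


def diff_api(old_source, new_source):
    """Return (removed, added, changed) lists of dotted names."""
    old, new = collect_api(old_source), collect_api(new_source)
    removed = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return removed, added, changed
```

Running diff_api on the source text of the same module from two releases gives three lists of names. The dict returned by collect_api could also be written out one entry per line and compared with an external diff program, as suggested above.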

The ast module provides parse(), and the built-in compile() accepts the resulting tree. The only thing is that I'm not entirely sure how much information is available to you after parsing (as you don't want to compile()).

demented hedgehog answered Oct 03 '22 00:10



Perhaps you can start by using the inspect module

import inspect
import types

def genFunctions(module):
    """Print each public function in the module with its argument names."""
    for name in dir(module):
        if name.startswith('_'):
            continue
        element = getattr(module, name)
        if isinstance(element, types.FunctionType):
            # inspect.getargspec() was removed in Python 3.11;
            # getfullargspec() exposes the same .args list.
            argSpec = inspect.getfullargspec(element)
            argList = argSpec.args
            print("{}.{}({})".format(module.__name__, name, ", ".join(argList)))

That will give you a list of "public" (not starting with underscore) functions with their argument lists. You can add more stuff to print the kwargs, classes, etc.

Once you run that on all the packages/modules you care about, in both old and new versions, you'll have two lists like this:

myPackage.myModule.myFunction1(foo, bar)
myPackage.myModule.myFunction2(baz)

Then you can either just sort and diff them, or write some smarter tooling in Python to actually compare all the names, e.g. to permit additional optional arguments but reject new mandatory arguments.
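The "smarter tooling" could look something like the following sketch, which compares two function objects with inspect.signature (Python 3). The function name compatibility_problems and the wording of the messages are invented for this example. It only looks at parameters, not at return values or behaviour, and it assumes that both versions of the function can be imported side by side.

```python
import inspect

_POSITIONAL = (
    inspect.Parameter.POSITIONAL_ONLY,
    inspect.Parameter.POSITIONAL_OR_KEYWORD,
)


def _positional_names(params):
    """Names of the parameters that can be passed by position, in order."""
    return [name for name, p in params.items() if p.kind in _POSITIONAL]


def compatibility_problems(old_func, new_func):
    """List reasons why calls written for old_func may break on new_func."""
    old = inspect.signature(old_func).parameters
    new = inspect.signature(new_func).parameters
    problems = []

    for name in old:
        if name not in new:
            problems.append("removed parameter: " + name)

    for name, param in new.items():
        variadic = param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD)
        # A new parameter is acceptable only if callers may omit it.
        if name not in old and param.default is param.empty and not variadic:
            problems.append("new mandatory parameter: " + name)

    # Existing positional calls rely on the old names staying a prefix.
    old_pos = _positional_names(old)
    if _positional_names(new)[:len(old_pos)] != old_pos:
        problems.append("positional parameters changed")

    return problems
```

For example, going from myFunction1(foo, bar) to myFunction1(foo, bar, baz=None) yields an empty list, while myFunction1(foo, bar, baz) reports baz as a new mandatory parameter.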

John Zwinck answered Oct 03 '22 01:10
