Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I Pytest a project using PEP 420 namespace packages?

Tags:

pytest

I am trying to use Pytest to test a largish (~100k LOC, 1k files) project and there are several other similar projects for which I'd like to do the same, eventually. This is not a standard Python package; it's part of a heavily customized system that I have little power to change, at least in the near term. Test modules are integrated with the code, rather than being in a separate directory, and that is important to us. The configuration is fairly similar to this question, and my answer there that might provide helpful background, too.

The issue I'm having is that the projects use PEP 420 implicit namespace packages almost exclusively; that is, there are almost no __init__.py files in any of the package directories. I haven't seen any cases yet where the packages had to be namespace packages, but given that this project is combined with other projects that also have Python code, this could happen (or already be happening and I've just not noticed it).

Consider a repository that looks like the following. (For a runnable copy of it, including the tests described below, clone 0cjs/pytest-impl-ns-pkg from GitHub.) All tests below are assumed to be in project/thing/thing_test.py.

repo/
    project/
        util/
            thing.py
            thing_test.py

I have enough control over the testing configurations that I can ensure sys.path is set appropriately for imports of the code under test to work properly. That is, the following test will pass:

def test_good_import():
    import project.util.thing

However, Pytest is determining package names from files using its usual system, giving package names that are not the standard ones for my configuration and adding subdirectories of my project to sys.path. So the following two tests fail:

def test_modulename():
    assert 'project.util.thing_test' == __name__
    # Result: AssertionError: assert 'project.util.thing_test' == 'thing_test'

def test_bad_import():
    ''' While we have a `project.util.thing` deep in our hierarchy, we do
        not have a top-level `thing` module, so this import should fail.
    '''
    with raises(ImportError):
        import thing
    # Result: Failed: DID NOT RAISE <class 'ImportError'>

As you can see, while thing.py can always be imported as project.util.thing, thing_test.py is project.util.thing_test outside of Pytest, but in a Pytest run project/util is added to sys.path and the module is named thing_test.

This introduces a number of problems:

  1. Module namespace collisions (e.g., between project/util/thing_test.py and project/otherstuff/thing_test.py).
  2. Bad import statements not being caught because the code under test is also using these non-production import paths.
  3. Relative imports may not work in test code because the module has been "moved" in the hierarchy.
  4. In general I'm quite nervous about having a large number of extra paths added to sys.path in testing that will be absent in production as I see a lot of potential for errors in this. But let's call that the first (and at the moment, I guess, default) option.

What I think I would like to be able to do would be to tell Pytest that it should determine module names relative to specific filesystem paths that I provide, rather than itself deciding what paths to used based on presence and absence of __init__.py files. However, I see no way to do this with Pytest. (It's not out of the question for me to add this to Pytest, but that also won't happen in the near future as I think I'd want a much deeper understanding of Pytest before even proposing exactly how to do this.)

A third option (after just living with the current situation and changing pytest as above) is simply to add dozens of __init__.py files to the project. However, while using extend_path in them would (I think) deal with the namespace vs. regular package issue in the normal Python world, I think it would break our unusual release system for packages declared in multiple projects. (That is, if another project had a project.util.other module and was combined for release with our project, the collision between their project/util/__init__.py and our project/util/__init__.py would be a major problem.) Fixing this would be a major challenge since we'd have to, among other things, add some way to declare that some directories containing an __init__.py are actually namespace packages.

Are there ways to improve the above options? Are there other options I'm missing?

like image 540
cjs Avatar asked May 04 '18 11:05

cjs


People also ask

What are namespace packages in Python?

In Python, a namespace package allows you to spread Python code among several projects. This is useful when you want to release related libraries as separate downloads.

How do you create a namespace in Python?

A namespace gets created automatically when a module or package starts execution. Hence, create namespace in python, all you have to do is call a function/ object or import a module/package.

What is a namespace package Pycharm?

Namespace packages allow you to split the sub-packages and modules within a single package across multiple, separate distribution packages (referred to as distributions in this document to avoid ambiguity). For example, if you have the following package structure: mynamespace/ __init__.py subpackage_a/ __init__.py ...


1 Answers

The issue you are facing is that you place tests aside the production code inside namespace packages. As stated here, pytest recognizes your setup as standalone test modules:

Standalone test modules / conftest.py files

...

pytest will find foo/bar/tests/test_foo.py and realize it is NOT part of a package given that there’s no __init__.py file in the same folder. It will then add root/foo/bar/tests to sys.path in order to import test_foo.py as the module test_foo. The same is done with the conftest.py file by adding root/foo to sys.path to import it as conftest.

So the proper way to solve (at least part of) this would be to adjust the sys.path and separate tests from production code, e.g. moving test module thing_test.py into a separate directory project/util/tests. Since you can't do that, you have no choice but to mess with pytest's internals (as you won't be able to override the module import behaviour via hooks). Here's a proposal: create a repo/conftest.py with the patched LocalPath class:

# repo/conftest.py

import pathlib
import py._path.local


# the original pypkgpath method can't deal with namespace packages,
# considering only dirs with __init__.py as packages
pypkgpath_orig = py._path.local.LocalPath.pypkgpath

# we consider all dirs in repo/ to be namespace packages
rootdir = pathlib.Path(__file__).parent.resolve()
namespace_pkg_dirs = [str(d) for d in rootdir.iterdir() if d.is_dir()]

# patched method
def pypkgpath(self):
    # call original lookup
    pkgpath = pypkgpath_orig(self)
    if pkgpath is not None:
        return pkgpath
    # original lookup failed, check if we are subdir of a namespace package
    # if yes, return the namespace package we belong to
    for parent in self.parts(reverse=True):
        if str(parent) in namespace_pkg_dirs:
            return parent
    return None

# apply patch
py._path.local.LocalPath.pypkgpath = pypkgpath

pytest>=6.0

Version 6 removes usages of py.path so the monkeypatching should be applied to _pytest.pathlib.resolve_package_path instead of LocalPath.pypkgpath, but the rest is essentially the same:

# repo/conftest.py

import pathlib
import _pytest.pathlib


resolve_pkg_path_orig = _pytest.pathlib.resolve_package_path

# we consider all dirs in repo/ to be namespace packages
rootdir = pathlib.Path(__file__).parent.resolve()
namespace_pkg_dirs = [str(d) for d in rootdir.iterdir() if d.is_dir()]

# patched method
def resolve_package_path(path):
    # call original lookup
    result = resolve_pkg_path_orig(path)
    if result is not None:
        return result
    # original lookup failed, check if we are subdir of a namespace package
    # if yes, return the namespace package we belong to
    for parent in path.parents:
        if str(parent) in namespace_pkg_dirs:
            return parent
    return None

# apply patch
_pytest.pathlib.resolve_package_path = resolve_package_path
like image 140
hoefling Avatar answered Sep 22 '22 07:09

hoefling