When parametrizing tests and fixtures in pytest, pytest seems to eagerly evaluate all parameters and to build some test-list data structure before it starts executing the tests.
This is a problem in two situations: when the parameter data is large, so collecting everything up front wastes memory, and when building a parameter requires expensive setup (e.g. a DB connection or subprocess) that should only happen when the test actually runs.
Thus my question: is it possible to tell pytest to evaluate the parameters on the fly (i.e. lazily)?
pytest-lazy-fixture lets you use a fixture as one of the values passed to @pytest.mark.parametrize:

import pytest
from pytest_lazyfixture import lazy_fixture

@pytest.fixture
def one():
    return 1

@pytest.mark.parametrize('arg1,arg2', [
    ('val1', lazy_fixture('one')),
])
def test_func(arg1, arg2):
    assert arg2 == 1
pytest.fixture() allows one to parametrize fixture functions.
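For example, a minimal sketch of a parametrized fixture (the names are illustrative):

import pytest

@pytest.fixture(params=[1, 2, 3])
def number(request):
    # request.param holds the current parameter value;
    # every test using this fixture runs once per value.
    return request.param

def test_positive(number):
    assert number > 0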
pytest will build a string that is the test ID for each set of values in a parametrized test. These IDs can be used with -k to select specific cases to run, and they will also identify the specific case when one is failing. Running pytest with --collect-only will show the generated IDs.
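A small sketch of how those IDs look (the test names here are illustrative):

import pytest

@pytest.mark.parametrize('word', ['alpha', 'beta'])
def test_upper(word):
    assert word.upper().isupper()

# 'pytest --collect-only' lists test_upper[alpha] and test_upper[beta];
# 'pytest -k alpha' runs only the first case.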
As for your second question, the link to the manual proposed in the comments seems to be exactly what you should do: it allows you "to setup expensive resources like DB connections or subprocess only when the actual test is run".
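A minimal sketch of that idea, assuming a hypothetical connect_to_database() helper for the expensive resource:

import pytest

@pytest.fixture
def db_connection():
    # Nothing happens at collection time; the expensive setup runs
    # only when a test that requests this fixture is executed.
    conn = connect_to_database()   # hypothetical expensive setup
    yield conn
    conn.close()                   # teardown after the test finishes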
But as for the first question, it seems that such a feature is not implemented. You can pass a generator directly to parametrize, like so:

@pytest.mark.parametrize('data', data_gen)
def test_gen(data):
    ...

But pytest will call list() on your generator, so the RAM problem persists here as well.
I've also found some GitHub issues that shed more light on why pytest does not handle generators lazily, and it seems to be a design problem: "it's not possible to correctly manage parametrization having a generator as value" because "pytest would have to collect all those tests with all the metadata... collection happens always before test running".
There are also some references to hypothesis or nose's yield-based tests for such cases. But if you still want to stick with pytest, there are some workarounds:
import pytest

def get_data(N):
    for i in range(N):
        yield list(range(N))

N = 3000
data_gen = get_data(N)

@pytest.mark.parametrize('ind', range(N))
def test_yield(ind):
    data = next(data_gen)
    assert data
So here you parametrize over the index (which is not very useful by itself; it just tells pytest how many runs it must make) and generate the data inside each run. You can also wrap it with memory_profiler:
Results (46.53s):
3000 passed
Filename: run_test.py
Line # Mem usage Increment Line Contents
================================================
5 40.6 MiB 40.6 MiB @profile
6 def to_profile():
7 76.6 MiB 36.1 MiB pytest.main(['test.py'])
And compare with the straightforward version:
@pytest.mark.parametrize('data', data_gen)
def test_yield(data):
    assert data
Which 'eats' much more memory:
Results (48.11s):
3000 passed
Filename: run_test.py
Line # Mem usage Increment Line Contents
================================================
5 40.7 MiB 40.7 MiB @profile
6 def to_profile():
7 409.3 MiB 368.6 MiB pytest.main(['test.py'])
data_gen = get_data(N)

@pytest.fixture(scope='module', params=range(len_of_gen_if_known))
def fix():
    huge_data_chunk = next(data_gen)
    return huge_data_chunk

@pytest.mark.parametrize('other_param', ['aaa', 'bbb'])
def test_one(fix, other_param):
    data = fix
    ...
So we use the fixture here at module scope in order to "preset" our data for the parametrized tests. Note that you can add another test right here and it will receive the generated data as well. Simply add it after test_one:
@pytest.mark.parametrize('param2', [15, 'asdb', 1j])
def test_two(fix, param2):
    data = fix
    ...
NOTE: if you do not know how many items will be generated, you can use this trick: set some approximate value (preferably a bit higher than the number of generated tests) and "mark" the remaining tests as passed when the generator stops with StopIteration, which happens once all the data has been generated.
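A rough sketch of that trick; here the leftover cases are skipped rather than marked passed, and the overshoot value is an assumption:

import pytest

data_gen = get_data(N)

# Overshoot the expected count a little; once the generator is exhausted
# the remaining parametrized cases end early instead of failing.
@pytest.fixture(scope='module', params=range(N + 10))
def fix(request):
    try:
        return next(data_gen)
    except StopIteration:
        pytest.skip("all data already generated")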
Another possibility is to use factories as fixtures: you embed your generator into a fixture and iterate over it inside your test until it is exhausted. But there is another disadvantage here - pytest will treat it as a single test (with possibly a bunch of checks inside) and it will fail if any of the generated data fails. In other words, compared to the parametrize approach, not all pytest statistics/features are available.
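A minimal sketch of that factory approach, reusing the get_data generator from above:

import pytest

@pytest.fixture
def data_factory():
    # Create the generator lazily, only when the test runs.
    return get_data(N)

def test_all_chunks(data_factory):
    # A single pytest test that consumes every generated chunk;
    # the first failing chunk fails the whole test.
    for chunk in data_factory:
        assert chunk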
And yet another option is to call pytest.main() in a loop, something like:

# generate data
# set up the test
pytest.main(['test'])
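A rough sketch of that loop; the environment-variable mechanism and the test_chunk.py file name are assumptions for illustration, not part of the original answer:

import os
import pytest

for i in range(N):
    # Prepare the data for this iteration, then hand the index to the
    # test module through an environment variable and launch pytest.
    os.environ['DATA_INDEX'] = str(i)
    pytest.main(['test_chunk.py'])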
@pytest.mark.parametrize("one", list_1)
@pytest.mark.parametrize("two", list_2)
def test_maybe_convert_objects(self, one, two):
...
Change to:
@pytest.mark.parametrize("one", list_1)
def test_maybe_convert_objects(self, one):
for two in list_2:
...
It's similar to factories but even easier to implement. It not only reduces RAM usage several times over, but also the time spent collecting meta-info. The drawback is that for pytest this is a single test covering all the two values. And it works smoothly with "simple" tests - if you have special marks (xfail and the like) inside, there might be problems.
I've also opened a corresponding issue; some additional info/tweaks about this problem may appear there.
EDIT: my first reaction would be "that is exactly what parametrized fixtures are for": a function-scoped fixture is a lazy value that is called just before the test node is executed, and by parametrizing the fixture you can predefine as many variants (for example from a database key listing) as you like.
import pytest
from pytest_cases import fixture_plus

@fixture_plus
def db():
    return <todo>

@fixture_plus
@pytest.mark.parametrize("key", [<list_of keys>])
def sample(db, key):
    return db.get(key)

def test_foo(sample):
    return sample
That being said, in some (rare) situations you still need lazy values in a parametrize function, and you do not wish these to be the variants of a parametrized fixture. For those situations, there is now a solution also in pytest-cases, with lazy_value. With it, you can use functions in the parameter values, and these functions get called only when the test at hand is executed.
Here is an example showing two coding styles (switch the use_partial boolean arg to True to enable the other alternative):
from functools import partial
from random import random

import pytest
from pytest_cases import lazy_value

database = [random() for i in range(10)]

def get_param(i):
    return database[i]

def make_param_getter(i, use_partial=False):
    if use_partial:
        return partial(get_param, i)
    else:
        def _get_param():
            return database[i]
        return _get_param

many_lazy_parameters = (make_param_getter(i) for i in range(10))

@pytest.mark.parametrize('a', [lazy_value(f) for f in many_lazy_parameters])
def test_foo(a):
    print(a)
Note that lazy_value also has an id argument if you wish to customize the test IDs. The default is to use the function's __name__, and support for partial functions is on the way.
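For example, a small sketch that overrides the ID, building on the make_param_getter helper from the example above (the id keyword is the one mentioned in the text):

@pytest.mark.parametrize('a', [lazy_value(make_param_getter(3), id='third_param')])
def test_custom_id(a):
    print(a)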
You can parametrize fixtures the same way, but remember that you have to use @fixture_plus instead of @pytest.fixture. See the pytest-cases documentation for details.
I'm the author of pytest-cases, by the way ;)