I'm starting to use pytest to add unit tests to a piece of software that can analyse different kinds of datasets.
I wrote a set of test functions that I would like to apply to different datasets. One complication is that the datasets are quite big, so I would like to do:
load dataset 1, run the tests on it, release the memory,
load dataset 2, run the tests on it, release the memory,
and so on.
Right now I'm able to test one dataset by using a fixture:
@pytest.fixture(scope="module")
def data():
    return load_dataset1()
and then passing data to each test function. I know that I can pass the params keyword to pytest.fixture, but how can I implement the sequential loading of the different datasets, without holding all of them in RAM at the same time?
By using the @Factory and @DataProvider annotations of TestNG you can execute the same test case multiple times with different data.
Use params as you mentioned:
@pytest.fixture(scope='module', params=[load_dataset1, load_dataset2])
def data(request):
    loader = request.param
    dataset = loader()
    return dataset
Use fixture finalization if you want to do fixture-specific cleanup:
@pytest.fixture(scope='module', params=[load_dataset1, load_dataset2])
def data(request):
    loader = request.param
    dataset = loader()

    def fin():
        # finalize dataset-related resources
        pass

    request.addfinalizer(fin)
    return dataset
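Finalizers registered with request.addfinalizer run when the fixture's scope ends (here, after the last test in the module for each dataset), in reverse registration order. A plain-Python sketch of that semantics, with FakeRequest as an assumed stand-in for pytest's real request object:

```python
class FakeRequest:
    """Minimal stand-in for pytest's request, modelling addfinalizer."""

    def __init__(self):
        self._finalizers = []

    def addfinalizer(self, fin):
        self._finalizers.append(fin)

    def finish(self):
        # pytest calls finalizers when the scope ends, last-in-first-out.
        while self._finalizers:
            self._finalizers.pop()()

events = []
request = FakeRequest()
request.addfinalizer(lambda: events.append("close file"))
request.addfinalizer(lambda: events.append("free dataset"))
request.finish()
# events == ["free dataset", "close file"]: reverse registration order
```

The LIFO order means a finalizer registered after a resource was acquired can safely rely on earlier-acquired resources still being alive when it runs.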