Run same test on multiple datasets

https://stackoverflow.com/questions/22591297

19-06-2023
Question

I'm starting to use pytest to add unit tests to software that can analyse different kinds of datasets.

I wrote a set of test functions that I would like to apply to different datasets. One complication is that the datasets are quite big, so I would like to do:

Load dataset1
Run tests
Load dataset2
Run tests

and so on.

Right now I'm able to use one dataset using a fixture:

@pytest.fixture(scope="module")
def data():
    return load_dataset1()

and then passing data to each test function.

I know that I can pass the params keyword to pytest.fixture. But how can I load the different datasets sequentially, so that they are not all in RAM at the same time?

Solution

Use params as you mentioned:

@pytest.fixture(scope='module', params=[load_dataset1, load_dataset2])
def data(request):
    loader = request.param
    dataset = loader()
    return dataset

Add a finalizer if a dataset needs its own cleanup once its tests have finished:

@pytest.fixture(scope='module', params=[load_dataset1, load_dataset2])
def data(request):
    loader = request.param
    dataset = loader()
    def fin():
        # finalize dataset-related resource
        pass
    request.addfinalizer(fin)
    return dataset
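pytest also supports writing the teardown after a yield instead of registering a finalizer. The sketch below shows that form under some assumptions: the loaders and the release helper are hypothetical, and the ids argument is optional and only makes the test names easier to read.

```python
import pytest


def load_dataset1():
    # Hypothetical stand-in for the real loader.
    return {"name": "dataset1", "rows": list(range(3))}


def load_dataset2():
    # Hypothetical stand-in for the real loader.
    return {"name": "dataset2", "rows": list(range(5))}


def release(dataset):
    # Hypothetical cleanup helper: drop the bulky payload.
    dataset.clear()


@pytest.fixture(
    scope="module",
    params=[load_dataset1, load_dataset2],
    ids=["dataset1", "dataset2"],
)
def data(request):
    dataset = request.param()
    # Tests that use this parameter run while the fixture is suspended here.
    yield dataset
    # Teardown: runs once the tests using this parameter are done.
    release(dataset)


def test_has_rows(data):
    assert data["rows"]
```

Both forms behave the same way here; the yield version simply keeps setup and cleanup next to each other.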

OTHER TIPS

Falsetru's answer is quite good, but because this is a hard problem, I wanted to share a slightly different solution using @pytest.mark.parametrize.

@pytest.fixture(scope="module")
def data1():
    return get_dataset1()

@pytest.fixture(scope="module")
def data2():
    return get_dataset2()

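@pytest.mark.parametrize('data_fixture', ['data1', 'data2'])
def test_datafoo_is_bar(data_fixture, request):
    # Look up the fixture by name at run time.
    data = request.getfixturevalue(data_fixture)
    # 'foo' and 'bar' are placeholders for a real key and expected value.
    assert data['foo'] == 'bar'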
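One caveat with the approach above: data1 and data2 are separate module-scoped fixtures, so once both have been requested they both stay alive until the module finishes, which may not suit very large datasets. A possible variation, sketched here with hypothetical stand-in loaders, keeps @pytest.mark.parametrize but uses indirect=True so that a single data fixture receives the loader through request.param.

```python
import pytest


def get_dataset1():
    # Hypothetical stand-in for the real loader.
    return {"foo": "bar", "source": "dataset1"}


def get_dataset2():
    # Hypothetical stand-in for the real loader.
    return {"foo": "bar", "source": "dataset2"}


@pytest.fixture(scope="module")
def data(request):
    # With indirect=True, the parametrize value arrives as request.param.
    return request.param()


@pytest.mark.parametrize("data", [get_dataset1, get_dataset2], indirect=True)
def test_datafoo_is_bar(data):
    # "foo" and "bar" are placeholders for a real key and expected value.
    assert data["foo"] == "bar"
```

The trade-off is that each test needing the datasets has to carry the decorator, whereas params on the fixture applies to every test that requests it.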

Licensed under: CC-BY-SA with attribution