Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Making a Python unit test that never runs in parallel

tl;dr - I want to write a Python unittest function that deletes a file, runs a test, and the restores the file. This causes race conditions because unittest runs multiple tests in parallel, and deleting and creating the file for one test messes up other tests that happen at the same time.

Long Specific Example:

I have a Python module named converter.py and it has associated tests in test_converter.py. If there is a file named config_custom.csv in the same directory as converter.py, then the custom configuration will be used. If there is no custom CSV config file, then there is a default configuration built into converter.py.

I wrote a unit test using unittest from the Python 2.7 standard library to validate this behavior. The unit test in setUp() would rename config_custom.csv to wrong_name.csv, then it would run the tests (hopefully using the default config), then in tearDown() it would rename the file back the way it should be.

Problem: Python unit tests run in parallel, and I got terrible race conditions. The file config_custom.csv would get renamed in the middle of other unit tests in a non-deterministic way. It would cause at least one error or failure about 90% of the time that I ran the entire test suite.

The ideal solution would be to tell unittest: Do NOT run this test in parallel with other tests, this test is special and needs complete isolation.

My work-around is to add an optional argument to the function that searches for config files. The argument is only passed by the test suite. It ignores the config file without deleting it. Actually deleting the test file is more graceful, that is what I actually want to test.

like image 311
SerMetAla Avatar asked Feb 14 '14 08:02

SerMetAla


2 Answers

I frequently need to run tests with complete isolation. The only way I've found that works consistently is to put those tests in separate classes. Agreed that handling config files inside tests is still kind of a pain.

For something I'm doing right now, I may also try something like pytest-ordering for running tests in a more deterministic fashion.

like image 136
szeitlin Avatar answered Oct 12 '22 00:10

szeitlin


The Django 3.2 documentation contains an official solution to this problem using the django.test.testcases.SerializeMixin. This will force certain tests to run in series and prevent errors arising when they try to access the same resource.

As per the Documentation, you would do something like this:

import os

from django.test import TestCase
from django.test.testcases import SerializeMixin

class ImageTestCaseMixin(SerializeMixin):
    lockfile = __file__

    def setUp(self):
        self.filename = os.path.join(temp_storage_dir, 'my_file.png')
        self.file = create_file(self.filename)

class RemoveImageTests(ImageTestCaseMixin, TestCase):
    def test_remove_image(self):
        os.remove(self.filename)
        self.assertFalse(os.path.exists(self.filename))

class ResizeImageTests(ImageTestCaseMixin, TestCase):
    def test_resize_image(self):
        resize_image(self.file, (48, 48))
        self.assertEqual(get_image_size(self.file), (48, 48))
like image 21
Lance Avatar answered Oct 12 '22 01:10

Lance