Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Testing thread safety fails with Spock

The subject

I have some code that is decidedly not thread safe:

public class ExampleLoader
{
    private List<String> strings;

    protected List<String> loadStrings()
    {
        return Arrays.asList("Hello", "World", "Sup");
    }

    public List<String> getStrings()
    {
        if (strings == null)
        {
            strings = loadStrings();
        }

        return strings;
    }
}

Multiple threads accessing getStrings() simultaneously are expected to see strings as null, and thus loadStrings() (which is an expensive operation) is triggered multiple times.

The problem

I wanted to make the code thread safe, and as a good citizen of the world I wrote a failing Spock spec first:

def "getStrings is thread safe"() {
    given:
    def loader = Spy(ExampleLoader)
    def threads = (0..<10).collect { new Thread({ loader.getStrings() })}

    when:
    threads.each { it.start() }
    threads.each { it.join() }

    then:
    1 * loader.loadStrings()
}

The above code creates and starts 10 threads that each calls getStrings(). It then asserts that loadStrings() was called only once when all threads are done.

I expected this to fail. However, it consistently passes. What?

After a debugging session involving System.out.println and other boring things, I found that the threads are indeed asynchronous: their run() methods printed in a seemingly random order. However, the first thread to access getStrings() would always be the only thread to call loadStrings().

The weird part

Frustrated after quite some time spent debugging, I wrote the same test again with JUnit 4 and Mockito:

@Test
public void getStringsIsThreadSafe() throws Exception
{
    // given
    ExampleLoader loader = Mockito.spy(ExampleLoader.class);
    List<Thread> threads = IntStream.range(0, 10)
            .mapToObj(index -> new Thread(loader::getStrings))
            .collect(Collectors.toList());

    // when
    threads.forEach(Thread::start);
    threads.forEach(thread -> {
        try {
            thread.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    });

    // then
    Mockito.verify(loader, Mockito.times(1))
            .loadStrings();
}

This test consistently fails due to multiple calls to loadStrings(), as was expected.

The question

Why does the Spock test consistently pass, and how would I go about testing this with Spock?

like image 378
Martin Lehmann Avatar asked Sep 17 '16 23:09

Martin Lehmann


1 Answers

The cause of your problem is that Spock makes methods it spies on synchronized. Particularly, the method MockController.handle(), through which all such calls go, is synchronized. You'll easily notice it if you add a pause and some output to your getStrings() method.

public List<String> getStrings() throws InterruptedException {
    System.out.println(Thread.currentThread().getId() + " goes to sleep");
    Thread.sleep(1000);
    System.out.println(Thread.currentThread().getId() + " awoke");
    if (strings == null) {
        strings = loadStrings();
    }
    return strings;
}

This way Spock inadvertently fixes your concurrency problem. Mockito obviously uses another approach.

A couple of other thoughts on your tests:

First, you don't do much to ensure that all your threads have come to the getStrings() call at the same moment, thus decreasing the probability of collisions. Long time may pass between threads start (long enough for the first one to complete the call before others start it). A better approach would be to use some synchronization primitive to remove the influence of threads startup time. For instance, a CountDownLatch may be of use here:

given:
def final CountDownLatch latch = new CountDownLatch(10)
def loader = Spy(ExampleLoader)
def threads = (0..<10).collect { new Thread({
    latch.countDown()
    latch.await()
    loader.getStrings()
})}

Of course, within Spock it will make no difference, it's just an example on how to do it in general.

Second, the problem with concurrency tests is that they never guarantee that your program is thread safe. The best you can hope for is that such test will show you that your program is broken. But even if the test passes, it doesn't prove thread safety. To increase chances of finding concurrency bugs you may want to run the test many times and gather statistics. Sometimes, such tests only fail once in several thousands or even once in several hundreds thousands of runs. Your class is simple enough to make guesses about its thread safety, but such approach will not work with more complicated cases.

like image 173
Andrew Lygin Avatar answered Nov 06 '22 02:11

Andrew Lygin