
Why does my simple, finite hypothesis test never stop?

I am running a test suite with hypothesis-4.24.6 and pytest-5.0.0. My test has a finite set of possible inputs, but hypothesis never finishes testing.

I have reduced it to the following minimal example, which I run with pytest test.py:

from hypothesis import given
import hypothesis.strategies as st


@given(x=st.just(0)
         | st.just(1),
       y=st.just(0)
         | st.just(1)
         | st.just(2))
def test_x_y(x, y):
    assert True

I would expect it to try all six combinations and then succeed, or possibly a small multiple of that to check for flakiness. Instead it runs indefinitely (after about 15 minutes of testing I kill it).

If I interrupt the test, the tracebacks seem to show it just continuously generating new examples.

What have I done wrong here?

asked Jul 01 '19 07:07 by tahsmith



1 Answer

This seems to be connected to the number of successful test cases Hypothesis tries to generate:

>>> from hypothesis import given, strategies as st
>>> @given(st.integers(0,1), st.integers(0,2))
... def test(x, y):
...   print(x, y)
...   assert True
... 
>>> test()
0 0
1 1
1 0
1 2
1 1
0 1
0 0
1 2
0 2
0 2
1 0
1 2
0 1
0 1
1 2
[snip…]

See the settings section of the docs: by default, Hypothesis aims for 100 successful test cases (max_examples). With only 6 distinct inputs available, it seems to keep generating more and more data while landing on the same 6 cases again and again.

The simplest approach is to limit the number of examples needed for this test to pass:

>>> from hypothesis import settings
>>> @settings(max_examples=30)
... @given(st.integers(0,1), st.integers(0,2))
... def test(x, y):
...   print(x, y)
...   assert True
... 
>>> test()
0 0
1 1
1 0
0 2
1 2
0 1
0 1
1 1
1 0
1 1
0 1
1 2
1 1
0 0
0 2
0 2
0 0
1 2
1 0
0 1
1 0
1 0
0 1
1 2
1 1
0 2
0 0
1 2
0 0
0 2
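
As a rough guide to choosing max_examples, here is a minimal sketch that estimates how likely 30 draws are to hit all six combinations. It assumes each draw is uniform and independent, which Hypothesis does not promise, so treat the number as a back-of-the-envelope figure only. The function name coverage_probability is made up for this illustration.

```python
from math import comb


def coverage_probability(n_cases: int, n_draws: int) -> float:
    """Probability that n_draws uniform, independent draws hit every one of
    n_cases distinct values (inclusion-exclusion over the missed values)."""
    return sum(
        (-1) ** k * comb(n_cases, k) * ((n_cases - k) / n_cases) ** n_draws
        for k in range(n_cases + 1)
    )


# Six (x, y) combinations, 30 examples as in the settings above.
# ASSUMPTION: uniform, independent draws; Hypothesis itself may behave differently.
p = coverage_probability(6, 30)
print(f"{p:.3f}")
```

Under that assumption the probability comes out at roughly 0.97, so 30 examples will usually, but not always, cover all six cases.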

Another approach, given how few test cases there are, would be to spell them all out with @example and ask Hypothesis to run only those explicit examples:

>>> from hypothesis import given, example, settings, Phase, strategies as st
>>> @settings(phases=(Phase.explicit,))
... @given(x=st.integers(), y=st.integers())
... @example(x=0, y=0)
... @example(x=0, y=1)
... @example(x=0, y=2)
... @example(x=1, y=0)
... @example(x=1, y=1)
... @example(x=1, y=2)
... def test(x, y):
...   print(x, y)
...   assert True
... 
>>> test()
0 0
0 1
0 2
1 0
1 1
1 2
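
The six @example lines above are just the Cartesian product of the allowed x and y values. A minimal sketch using only the standard library to enumerate them, for instance to check that no combination was missed when writing the decorators by hand (the names X_VALUES, Y_VALUES and all_combinations are hypothetical, chosen for this illustration):

```python
from itertools import product

# Allowed values for each argument, taken from the question.
X_VALUES = (0, 1)
Y_VALUES = (0, 1, 2)


def all_combinations(xs, ys):
    """Return every (x, y) pair, in the same order as the @example list above."""
    return list(product(xs, ys))


combos = all_combinations(X_VALUES, Y_VALUES)
for x, y in combos:
    print(x, y)
```

With an input space this small, iterating over such a list directly is also an option if the property-based machinery is not needed.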

Also note that st.just(0) | st.just(1) is equivalent to st.one_of(st.just(0), st.just(1)), so choose one form and stick to it rather than mixing them.

answered Nov 15 '22 06:11 by 301_Moved_Permanently