We have several different optimization algorithms that produce a different result on each run. For example, the goal of the optimization could be to find the minimum of a function, where 0 is the global minimum. The optimization runs return data like this:
[0.1, 0.1321, 0.0921, 0.012, 0.4]
This is quite close to the global minimum, so it is acceptable. Our first approach was to simply choose a threshold and let the unit test fail if a result occurred that was too high. Unfortunately, this does not work at all: the results seem to follow a Gaussian distribution, so, although it is unlikely, from time to time the test fails even though the algorithm is still fine and we just had bad luck.
So, how can I test this properly? I think quite a bit of statistics is needed here. It is also important that the tests stay fast; just letting the test run a few hundred times and taking the average would be too slow.
Here are some further clarifications:
For example, I have an algorithm that fits a circle to a set of points. It is extremely fast but does not always produce the same result. I want to write a unit test to guarantee that in most cases it is good enough.
Unfortunately, I cannot choose a fixed seed for the random number generator, because I do not want to test whether the algorithm produces exactly the same result as before; I want to test something like "with 90% certainty I get a result of 0.1 or better".
It sounds like your optimizer needs two kinds of testing:
1. Testing how effective the algorithm is, i.e. how good its results are on average.
2. Testing that the code itself behaves correctly.
Since the algorithm involves randomization, (1) is difficult to unit-test. Any test of a random process will fail some proportion of the time. You need to know some statistics to understand just how often it should fail. There are ways to trade off between how strict your test is and how often it fails.
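One way to make that trade-off explicit is to turn the claim "90% of runs reach 0.1 or better" into a binomial test: run the optimizer a modest number of times and require a minimum number of successes, choosing the cut-off so that a spurious failure is very unlikely. Below is a minimal Python sketch of that idea; `run_optimizer` is a hypothetical stand-in for the real routine, and the numbers (20 runs, cut-off of 12) are purely illustrative.

```python
import math
import random
import unittest

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Hypothetical stand-in for the real optimizer; replace with your own call.
def run_optimizer():
    return abs(random.gauss(0.03, 0.03))

class TestOptimizerQuality(unittest.TestCase):
    def test_success_rate(self):
        n_runs = 20          # how many runs per test execution
        threshold = 0.1      # a single run "succeeds" if its result is <= threshold
        claimed_rate = 0.9   # the quality claim: 90% of runs succeed
        min_successes = 12   # cut-off chosen so a false failure is very unlikely

        # Sanity-check the cut-off: if the optimizer really succeeds 90% of the
        # time, the chance of seeing fewer than 12 successes in 20 runs is
        # binom_cdf(11, 20, 0.9), roughly 6e-5, so spurious failures are rare.
        self.assertLess(binom_cdf(min_successes - 1, n_runs, claimed_rate), 1e-3)

        successes = sum(run_optimizer() <= threshold for _ in range(n_runs))
        self.assertGreaterEqual(successes, min_successes)

if __name__ == "__main__":
    unittest.main()
```

Raising the cut-off makes the test stricter (it catches smaller regressions) but also raises the probability that a healthy algorithm fails by bad luck; that is exactly the trade-off mentioned above.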
But there are ways to write unit tests for (2). For example, you could reset the seed to a particular value before running your unit tests. Then the output is deterministic. That would not allow you to assess the average effectiveness of the algorithm, but that's for (1). Such a test would serve as a trip wire: if someone introduced a bug into the code during maintenance, a deterministic unit test might catch the bug.
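As a concrete illustration, such a trip-wire test could look roughly like this in Python. `fit_circle` is a hypothetical placeholder that draws from Python's global RNG; the point is only that pinning the seed makes the run reproducible, so the assertion can later be tightened to a value recorded from a known-good run.

```python
import random
import unittest

# Hypothetical stand-in for the real fitting routine; it draws its randomness
# from Python's global RNG, so seeding that RNG makes it deterministic.
def fit_circle(points):
    return random.uniform(0.0, 0.2)   # placeholder for the real result

POINTS = [(0.0, 1.0), (1.0, 0.0), (-1.0, 0.0)]

class TestTripWire(unittest.TestCase):
    def test_fixed_seed_is_deterministic(self):
        # With the seed pinned, two runs must agree exactly. In a real suite
        # you would instead compare against a value recorded once from a
        # known-good run, so that any later change in behaviour trips the test.
        random.seed(42)
        first = fit_circle(POINTS)
        random.seed(42)
        second = fit_circle(POINTS)
        self.assertEqual(first, second)
```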
There may be other things that could be unit tested. For example, maybe your algorithm is guaranteed to return values in a certain range no matter what happens with the randomized part. Maybe some value should always be positive, etc.
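A sketch of that kind of invariant test, again with a hypothetical `fit_circle` placeholder: whatever the randomized part does, the fitted radius should always be positive and finite, and checking this over a handful of runs stays cheap.

```python
import math
import random
import unittest

# Hypothetical circle-fitting routine; substitute your own. Whatever the
# randomized refinement does, a fitted circle should have a positive,
# finite radius.
def fit_circle(points):
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    r = sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)
    return cx, cy, r * random.uniform(0.95, 1.05)  # imitate the random step

class TestInvariants(unittest.TestCase):
    def test_radius_always_positive_and_finite(self):
        points = [(math.cos(a), math.sin(a)) for a in (0.1, 1.7, 3.0, 4.4, 5.9)]
        for _ in range(50):          # cheap: the invariants must hold on every run
            cx, cy, r = fit_circle(points)
            self.assertTrue(math.isfinite(r))
            self.assertGreater(r, 0.0)
```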
Update: I wrote a chapter about this problem in the book Beautiful Testing. See Chapter 10: Testing a Random Number Generator.
A unit test should never have an unknown pass/fail state. If your algorithm is returning different values when run with the same inputs multiple times, you are probably doing something screwy in your algorithm.
I would take each of the 5 optimization algorithms and test them to make sure that given a set of inputs x, you get back an optimized value of y every time.
EDIT: To address the random components of your system, you can either add the ability to pass in the seed for the random number generator to be used, or use a mocking library (à la RhinoMocks) to force a particular number to be returned whenever the RNG is asked for one.
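RhinoMocks is a .NET mocking library; the same idea expressed in Python, with a hypothetical `CircleFitter` that accepts its RNG as a dependency, might look like the sketch below, using either a seeded `random.Random` or a hand-rolled fake in place of a mocking framework.

```python
import random
import unittest

# Hypothetical optimizer that takes its RNG as a constructor argument, so
# tests can inject either a seeded generator or a fake one.
class CircleFitter:
    def __init__(self, rng=None):
        self.rng = rng or random.Random()

    def fit(self, points):
        # placeholder: the real code would call self.rng inside its search loop
        return self.rng.uniform(0.0, 0.2)

class FakeRng:
    """Stands in for the RNG and always returns the midpoint of the range."""
    def uniform(self, a, b):
        return (a + b) / 2.0

class TestWithControlledRandomness(unittest.TestCase):
    def test_with_seeded_rng(self):
        # Two fitters seeded identically must produce identical results.
        first = CircleFitter(rng=random.Random(1234)).fit([])
        second = CircleFitter(rng=random.Random(1234)).fit([])
        self.assertEqual(first, second)

    def test_with_fake_rng(self):
        # The fake removes randomness entirely, so the output is fully known.
        fitter = CircleFitter(rng=FakeRng())
        self.assertEqual(fitter.fit([]), 0.1)
```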