Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Test Probabilistic Functions

I need a function which returns an array in random order. I want to ensure that it is randomish but I have no idea how one would go about writing the tests to ensure that the array really is random. I can run the code a bunch of times and see if I have the same answer more than once. While collisions are unlikely for large arrays it is highly probable for small arrays (say two elements).

How should I go about it?

like image 554
stimms Avatar asked Mar 13 '09 02:03

stimms


3 Answers

Cedric recommends an approach where you run the function enough times to get a statistically significant sample and verify the properties of you samples.

So for shuffling, you'd probably want to verify that the relationship between the elements have very small covariance, that the expected position of each element is N/2, etc.

like image 107
Ryan Avatar answered Nov 09 '22 09:11

Ryan


Basically the trick is to extract the randomness out of the class you're testing. This will allow you to test the class by injecting the formula for the randomness from your test which of course wouldn't be random at all.

C# example:

public static List<int> Randomise(List<int> list, Func<bool> randomSwap)
{
    foreach(int i in list)
    {
        if (randomSwap)
        {
            //swap i and i+1;
        }
    }
    return list;
}

Pseudo Usage:

list = Randomise(list, return new Random(0, 1));
like image 21
Jonathan Parker Avatar answered Nov 09 '22 10:11

Jonathan Parker


Other articles have recommended using a fixed seed for the random number generator, mocking the random number generator. These are fine recommendations, and I often follow them. Sometimes, however, I will test the randomness instead.

Given a target array that you want to populate randomly from a source array consider doing the following. Load the source array with consecutive integers. Create a third array called 'sum' and load it with zeros. Now randomly populate the target and then add each element of the target to the corresponding element of sum. Do that another thousand times. If the distribution is really random, then the sums should all be roughly the same. You can do a simple -delta < expected < +delta comparison on each element of the sum array.

You can also do a mean and stdev of the elements of the sum array and do a delta comparison of them too.

If you set the limits right, and do enough iterations, this will suffice nicely. You might be tempted into thinking it can give you a false negative, but if you set the limits correctly it will be more likely for a cosmic ray to alter the execution of the program.

like image 24
Uncle Bob Avatar answered Nov 09 '22 09:11

Uncle Bob