
How do I use chi square distribution with C++ Boost library?

I've checked the examples on the Boost website, but they are not what I'm looking for.

To put it simply, I want to see if a number on a die is favored, using 600 rolls, so the expected count for every number (1 through 6) should be 100.

And I want to use the chi square distribution to check if the die is fair.

How would I do this?

sevaxx asked Jan 17 '10 04:01


1 Answer

Suppose e[i] and o[i] are arrays holding the expected and observed count of rolls for each of the 6 possibilities. In your case, e[i] is 100 for each bin, and o[i] is the number of times i was rolled in your 600 trials.

You then calculate the chi-squared statistic by summing (e[i]-o[i])^2/e[i] over the 6 bins. Let's say your o[i] array came out with 105, 95, 102, 98, 98, and 102 counts after doing your 600 trials.

chi2 = 5^2/100 + 5^2/100 + 2^2/100 + 2^2/100 + 2^2/100 + 2^2/100 = 0.660
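
The sum above can be sketched in plain C++ as follows. This is only an illustration of the arithmetic: the function name chi_squared_statistic and the use of std::vector are my own choices, not anything from Boost.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pearson's chi-squared statistic: the sum over all bins of
// (observed - expected)^2 / expected.
// Hypothetical helper for illustration; assumes both vectors have the
// same length and every expected count is positive.
double chi_squared_statistic(const std::vector<double>& observed,
                             const std::vector<double>& expected) {
    assert(observed.size() == expected.size());
    double chi2 = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        const double diff = observed[i] - expected[i];
        chi2 += diff * diff / expected[i];
    }
    return chi2;
}
```

Calling it with the observed counts {105, 95, 102, 98, 98, 102} and six expected counts of 100 should reproduce the 0.660 worked out by hand.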

You have five degrees of freedom (number of bins minus 1). So you're going to have a declaration like

boost::math::chi_squared mydist(5);

to create the Boost object representing your chi-square distribution.

At this point you would use the cdf accessor function (cumulative distribution function) from the Boost library to look up the cumulative probability of a chi-squared score of 0.660 with five degrees of freedom. The p-value is the complement of that probability.

p = boost::math::cdf(mydist,.660);

You should get something close to 0.015. The p-value is the complement, 1 - 0.015 = 0.985, which is interpreted as a 98.5% probability of observing a chi-squared score at least as large as 0.660 if one assumes the null hypothesis (that the die is fair) holds. So for this set of data, the null hypothesis cannot be rejected at any reasonable significance level. (Disclaimer: untested code! But if I understand the Boost documentation correctly, this is how it should work.)

Jim Lewis answered Sep 28 '22 08:09