
Test if a data distribution follows a Gaussian distribution in MATLAB

I have some data points and their mean. I need to find out whether those data points (with that mean) follow a Gaussian distribution. Is there a function in MATLAB that can perform this kind of test, or do I need to write one of my own?

I tried looking at different statistical functions provided by MATLAB. I am very new to MATLAB so I might have overlooked the right function.

cheers

Arnkrishn asked Dec 10 '09 18:12



People also ask

How do I know if my data follows the Gaussian distribution?

You can test the hypothesis that your data were sampled from a Normal (Gaussian) distribution visually (with QQ-plots and histograms) or statistically (with tests such as D'Agostino-Pearson and Kolmogorov-Smirnov).

How do I check if a distribution is normal in Matlab?

h = kstest( x ) returns a test decision for the null hypothesis that the data in vector x comes from a standard normal distribution, against the alternative that it does not come from such a distribution, using the one-sample Kolmogorov-Smirnov test.
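Because kstest compares against the standard normal specifically, data with a different mean or variance will be rejected even when they are perfectly Gaussian. A minimal sketch of the usual workaround, assuming the Statistics and Machine Learning Toolbox is installed (the variable names and sample sizes are illustrative only):

```matlab
% Requires the Statistics and Machine Learning Toolbox.
rng(0);                          % fixed seed so the run is repeatable
x = 5 + 2*randn(500, 1);         % normal data, but NOT standard normal

h_raw = kstest(x);               % compares x against N(0,1) directly

z = (x - mean(x)) / std(x);      % standardize with the sample mean and std
[h_std, p_std] = kstest(z);      % now compares only the shape against N(0,1)

fprintf('raw: h = %d, standardized: h = %d (p = %.3f)\n', h_raw, h_std, p_std);
```

Note that standardizing with parameters estimated from the same data makes the Kolmogorov-Smirnov p-value conservative; the Lilliefors test (lillietest) is designed for that situation.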

How do I know if my data follows a distribution?

If the data points fall along the straight line of a probability plot, the data are consistent with that distribution. With large samples this can hold even when the p-value is statistically significant, because a formal test will flag departures too small to matter in practice.


2 Answers

Check this documentation page on all available hypothesis tests.

From those, for your purpose you can use:

  • Chi-square goodness-of-fit test
  • Lilliefors test
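  • z-test (tests the mean of a sample assumed normal with known variance, not normality itself)
  • t-test (tests the mean of a sample assumed normal with unknown variance, not normality itself)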
  • Kolmogorov-Smirnov test

... among others
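As a rough sketch of how two of the tests listed above might be called, assuming the Statistics and Machine Learning Toolbox is available: both lillietest and chi2gof test for normality with the mean and variance estimated from the data, so no standardization is needed. The two samples here are made up for illustration.

```matlab
% Requires the Statistics and Machine Learning Toolbox.
rng(1);                               % fixed seed so the run is repeatable
x_norm = 3 + 0.5*randn(500, 1);       % Gaussian sample
x_exp  = exprnd(1, 500, 1);           % clearly non-Gaussian (exponential) sample

% Lilliefors test: h = 1 means "reject normality at the 5% level"
[h_lil_norm, p_lil_norm] = lillietest(x_norm);
[h_lil_exp,  p_lil_exp ] = lillietest(x_exp);

% Chi-square goodness-of-fit test against a fitted normal (the default)
[h_chi_norm, p_chi_norm] = chi2gof(x_norm);
[h_chi_exp,  p_chi_exp ] = chi2gof(x_exp);

fprintf('lillietest: normal h = %d, exponential h = %d\n', h_lil_norm, h_lil_exp);
fprintf('chi2gof:    normal h = %d, exponential h = %d\n', h_chi_norm, h_chi_exp);
```

lillietest looks its p-values up in a table, so it may warn when the p-value falls outside the tabulated range; the decision h is still valid in that case.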

You can also use some visual tests like:

  • hist
  • normplot
  • cdfplot
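The three visual checks above could be put side by side roughly like this. It is only a sketch on made-up data, and normplot, cdfplot and normcdf assume the Statistics and Machine Learning Toolbox:

```matlab
% Requires the Statistics and Machine Learning Toolbox for normplot/cdfplot/normcdf.
rng(2);                                  % fixed seed so the run is repeatable
x = 10 + 3*randn(300, 1);                % illustrative sample

figure;

subplot(1, 3, 1);
hist(x, 20);                             % should look roughly bell-shaped
title('hist');

subplot(1, 3, 2);
normplot(x);                             % points near the line suggest normality

subplot(1, 3, 3);
cdfplot(x);                              % empirical CDF of the sample
hold on;
xs = sort(x);
plot(xs, normcdf(xs, mean(x), std(x)), 'r--');   % fitted normal CDF for comparison
hold off;
```

Plots like these do not give a yes/no decision, but they show where the data depart from normality (tails, skew, outliers), which a single p-value cannot.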
Amro answered Oct 16 '22 17:10



I like Spiegelhalter's test (D. J. Spiegelhalter, 'Diagnostic tests of distributional shape,' Biometrika, 1983):

function pval = spiegel_test(x)
% Compute the p-value under the null hypothesis that x is normally distributed.
% x should be a vector.
xm = mean(x);
xs = std(x);
xz = (x - xm) ./ xs;                        % standardized observations
xz2 = xz.^2;
N = sum(xz2 .* log(xz2));                   % sum of z^2 * log(z^2)
n = numel(x);
ts = (N - 0.73 * n) / (0.8969 * sqrt(n));   % under the null, ts ~ N(0,1)
pval = 1 - abs(erf(ts / sqrt(2)));          % two-sided test

Whenever you hack together a statistical test, always check it under the null! Here's a simple example:

pvals = nan(10000, 1);
for j = 1:numel(pvals)
    pvals(j) = spiegel_test(randn(300, 1));
end
nnz(pvals < 0.05) / numel(pvals)

I get the results:

ans =    
   0.0505

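Similarly, since p-values should be roughly uniform under the null, the upper tail should also hold about 5% of them: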

nnz(pvals > 0.95) ./ numel(pvals)

I get

ans = 
   0.0475
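A natural follow-up to the null check is to see how often the test rejects when the data are not normal. The sketch below re-expresses the statistic from spiegel_test as anonymous functions so the snippet runs on its own; the uniform alternative, the sample size of 300 and the 2000 repetitions are arbitrary choices for illustration.

```matlab
rng(3);                                   % fixed seed so the run is repeatable

% Same statistic as spiegel_test above, written as anonymous functions
zsq  = @(x) ((x - mean(x)) ./ std(x)).^2;
ts   = @(x) (sum(zsq(x) .* log(zsq(x))) - 0.73 * numel(x)) / (0.8969 * sqrt(numel(x)));
pfun = @(x) 1 - abs(erf(ts(x) / sqrt(2)));

% Draw from a uniform distribution, which is symmetric but not Gaussian
pvals_alt = nan(2000, 1);
for j = 1:numel(pvals_alt)
    pvals_alt(j) = pfun(rand(300, 1));
end

% Fraction of samples for which normality is rejected at the 5% level
reject_rate = nnz(pvals_alt < 0.05) / numel(pvals_alt)
```

If the test has useful power against this alternative, reject_rate should come out well above the 0.05 seen under the null.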
shabbychef answered Oct 16 '22 18:10
