I need to use normaltest in SciPy to test whether a dataset is normally distributed, but I can't seem to find any good examples of how to use scipy.stats.normaltest.
My dataset has more than 100 values.
The Anderson–Darling test is a statistical test of whether a given sample of data is drawn from a given probability distribution. In its basic form, the test assumes that there are no parameters to be estimated in the distribution being tested, in which case the test and its set of critical values is distribution-free.
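For the normality case, SciPy provides this test as scipy.stats.anderson. The following is a minimal sketch on synthetic data; the seed, sample, and variable names are illustrative assumptions, not part of the question. Unlike the basic form described above, this function estimates the mean and variance from the sample, so the critical values it returns are specific to the normal case.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)        # assumed seed, only for reproducibility
skewed = rng.exponential(size=200)     # illustrative, clearly non-normal data

result = stats.anderson(skewed, dist="norm")

# significance_level holds percentages (15, 10, 5, 2.5, 1); pick the 5% entry.
levels = list(result.significance_level)
crit_5 = result.critical_values[levels.index(5.0)]

# Normality is rejected when the statistic exceeds the critical value.
reject_normality = result.statistic > crit_5
print(f"A^2 = {result.statistic:.3f}, 5% critical value = {crit_5:.3f}")
print("reject normality at 5%:", reject_normality)
```

Note that this test reports critical values rather than a p-value, so the decision is made by comparing the statistic against the chosen significance level.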
scipy.stats: This module contains a large number of probability distributions, summary and frequency statistics, correlation functions and statistical tests, masked statistics, kernel density estimation, quasi-Monte Carlo functionality, and more.
For quick, visual identification of a normal distribution, use a Q-Q plot if you have only one variable to look at and a box plot if you have many. Use a histogram if you need to present your results to a non-statistical audience. As a statistical test to confirm your hypothesis, use the Shapiro-Wilk test.
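A rough sketch of the Q-Q and Shapiro-Wilk checks with scipy.stats, assuming synthetic data (the seed, sample sizes, and variable names are made up for illustration). It computes the Q-Q data without drawing it; passing a matplotlib axes object via the plot argument of probplot would draw the figure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)                       # assumed seed
sample = rng.normal(loc=10.0, scale=2.0, size=200)   # illustrative normal data
skewed = rng.exponential(size=200)                   # illustrative skewed data

# Q-Q data: theoretical quantiles vs. ordered sample values, plus a
# least-squares line. r close to 1 means the points lie near a straight line.
(theoretical_q, ordered_values), (slope, intercept, r) = stats.probplot(
    sample, dist="norm"
)
print(f"Q-Q correlation for the normal sample: r = {r:.4f}")

# Shapiro-Wilk: the null hypothesis is that the data are normally distributed.
w_stat, p_value = stats.shapiro(skewed)
print(f"Shapiro-Wilk on the skewed sample: W = {w_stat:.4f}, p = {p_value:.3g}")
```

A small Shapiro-Wilk p-value is evidence against normality; a large one only means the test found no evidence against it.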
In [12]: import scipy.stats as stats
In [13]: x = stats.norm.rvs(size = 100)
In [14]: stats.normaltest(x)
Out[14]: (1.627533590094232, 0.44318552909231262)
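normaltest returns a 2-tuple of the test statistic and the associated p-value (recent SciPy versions return it as a named tuple, NormaltestResult, which unpacks the same way). The statistic combines the sample's skewness and kurtosis and, under the null hypothesis that x came from a normal distribution, approximately follows a chi-squared distribution. The p-value is the probability of seeing a statistic that large (or larger) if the null hypothesis were true.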
If the p-value is very small, it is unlikely that the data came from a normal distribution. For example:
In [15]: y = stats.uniform.rvs(size = 100)
In [16]: stats.normaltest(y)
Out[16]: (31.487039026711866, 1.4543748291516241e-07)
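Putting the two cases together, here is a small sketch of how the p-value might be turned into a decision. The helper name looks_normal, the 0.05 threshold, and the seeded samples are assumptions made for illustration; they are not part of SciPy or of the question.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05  # assumed significance level; choose one that suits your problem


def looks_normal(sample, alpha=ALPHA):
    """Hypothetical helper: return (is_consistent, statistic, p_value).

    is_consistent is True when normaltest does NOT reject normality at the
    given alpha. That is not proof that the data are normal.
    """
    statistic, p_value = stats.normaltest(sample)
    return bool(p_value >= alpha), float(statistic), float(p_value)


rng = np.random.default_rng(0)            # assumed seed, for reproducibility
normal_sample = rng.normal(size=200)      # more than 100 values, as in the question
uniform_sample = rng.uniform(size=200)

for name, data in [("normal", normal_sample), ("uniform", uniform_sample)]:
    ok, stat, p = looks_normal(data)
    print(f"{name:8s} statistic={stat:8.3f}  p={p:.3g}  consistent with normal: {ok}")
```

Even truly normal data will be rejected about 5% of the time at alpha = 0.05, so a single borderline p-value is best read alongside a plot of the data.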