Am I missing something obvious, or is Matlab's kstest2 giving very poor p-values?
By "very poor" I mean that I suspect it may even be wrongly implemented.
The help page of kstest2 states that the function calculates an asymptotic p-value, though I did not find any reference for exactly which method is used. Anyway, the description further states:
"The asymptotic p-value becomes very accurate for large sample sizes, and is believed to be reasonably accurate for sample sizes n1 and n2 such that (n1*n2)/(n1 + n2) ≥ 4."
Example 1
Let's take Example 6 from Lehmann and D'Abrera (1975):
sampleA = [6.8, 3.1, 5.8, 4.5, 3.3, 4.7, 4.2, 4.9];
sampleB = [4.4, 2.5, 2.8, 2.1, 6.6, 0.0, 4.8, 2.3];
[h,p,ks2stat] = kstest2(sampleA, sampleB, 'Tail', 'unequal');
Here (n1*n2)/(n1 + n2) = 4, so the p-value should be reasonably accurate.
Matlab yields p = 0.0497, while the solution given in the book is 0.0870.
To validate the solution I used R, which I trust more than Matlab, especially in statistics.
Using ks.test from the stats package and ks.boot from the Matching package:
ks.test(sampleA, sampleB, alternative = "two.sided")
ks.boot(sampleA, sampleB, alternative = "two.sided")
Both give p = 0.0870.
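As a further sanity check (my own addition, not from the book or from R), the samples are small and contain no ties, so the exact two-sided p-value can also be obtained in Matlab by brute force, enumerating every relabeling of the pooled data:

sampleA = [6.8, 3.1, 5.8, 4.5, 3.3, 4.7, 4.2, 4.9];
sampleB = [4.4, 2.5, 2.8, 2.1, 6.6, 0.0, 4.8, 2.3];
pooled  = [sampleA, sampleB];
n1 = numel(sampleA);  n = numel(pooled);

% Two-sample KS statistic: largest distance between the two empirical CDFs,
% evaluated at the pooled data points.
grid = sort(pooled);
ksD  = @(x, y) max(abs(mean(bsxfun(@le, x(:), grid), 1) - ...
                       mean(bsxfun(@le, y(:), grid), 1)));
dObs = ksD(sampleA, sampleB);        % 0.625 for these data

combos = nchoosek(1:n, n1);          % all 12870 ways to split 16 values into 8 + 8
count  = 0;
for k = 1:size(combos, 1)
    inA = false(1, n);  inA(combos(k, :)) = true;
    count = count + (ksD(pooled(inA), pooled(~inA)) >= dObs - 1e-12);
end
pExact = count / size(combos, 1)     % 0.0870, matching the book and R

So the exact distribution agrees with the book and with R, and the 0.0497 from kstest2 must come from its asymptotic approximation.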
Example 2
Let's use kstest2's own example to compare Matlab and R results for a larger sample size:
rng(1); % For reproducibility
x1 = wblrnd(1,1,1,50);
x2 = wblrnd(1.2,2,1,50);
[h,p,ks2stat] = kstest2(x1,x2);
This yields p = 0.0317. Now, using the same x1 and x2 vectors, R gives p = 0.03968.
That is about a 20% difference, even though a very accurate result is expected here: (n1*n2)/(n1 + n2) = 25.
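To get a reference value for Example 2 that does not rely on any asymptotic approximation, a permutation estimate of the p-value can be computed directly (again my own check, similar in spirit to what ks.boot does; there are too many splits to enumerate exactly at n1 = n2 = 50):

rng(1); % For reproducibility
x1 = wblrnd(1,1,1,50);
x2 = wblrnd(1.2,2,1,50);

pooled = [x1, x2];
n1 = numel(x1);  n = numel(pooled);
grid = sort(pooled);
ksD  = @(x, y) max(abs(mean(bsxfun(@le, x(:), grid), 1) - ...
                       mean(bsxfun(@le, y(:), grid), 1)));
dObs = ksD(x1, x2);

nPerm = 1e5;                         % random relabelings of the pooled sample
count = 0;
for k = 1:nPerm
    idx = randperm(n);
    count = count + (ksD(pooled(idx(1:n1)), pooled(idx(n1+1:end))) >= dObs - 1e-12);
end
pPerm = count / nPerm                % to compare against 0.0317 (Matlab) and 0.03968 (R)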
Am I missing or messing something up?
Is it possible that Matlab's kstest2 really performs as poorly as these examples indicate? Which approximation or algorithm does kstest2 use? (I can see the implemented code for kstest2, but a reference to a book or paper would be much better for understanding what is going on.)
I am using Matlab 2016a.
Lehmann and D'Abrera (1975). Nonparametrics: Statistical Methods Based on Ranks. 1st edition. Springer.
h = kstest2(x1, x2, Name, Value) returns a test decision for a two-sample Kolmogorov-Smirnov test with additional options specified by one or more name-value pair arguments. For example, you can change the significance level or conduct a one-sided test.
Assumptions: the null hypothesis is that both samples are randomly drawn from the same (pooled) set of values; the two samples are mutually independent; and the scale of measurement is at least ordinal.
The one-sample Kolmogorov-Smirnov test is used to test the null hypothesis that a set of data comes from a given reference distribution, such as a Normal distribution; its test statistic is the largest distance between the empirical CDF of the data and the hypothesized CDF.
Unlike the Chi-Square test, which can be used for testing against both continuous and discrete distributions, the K-S test is only appropriate for testing data against a continuous distribution, such as the normal or Weibull distribution.
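As a small illustration of the one-sample versus two-sample forms described above (my own example, not from the documentation quoted here):

rng(1);
x = wblrnd(1, 1, 100, 1);      % data actually drawn from a Weibull(1,1) distribution

[h1, p1] = kstest(x)           % one-sample test against a standard normal: rejected
[h2, p2] = kstest(x, 'CDF', makedist('Weibull', 'a', 1, 'b', 1))  % correct CDF: typically not rejected

y = wblrnd(1, 1, 100, 1);      % a second sample from the same distribution
[h3, p3] = kstest2(x, y)       % two-sample test, as in the question: typically not rejected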
I think that the correct test to compare with R's ks.test would be Octave's kolmogorov_smirnov_test_2:
sampleA = [6.8, 3.1, 5.8, 4.5, 3.3, 4.7, 4.2, 4.9];
sampleB = [4.4, 2.5, 2.8, 2.1, 6.6, 0.0, 4.8, 2.3];
kolmogorov_smirnov_test_2(sampleA, sampleB)
pval: 0.0878664
The difference appears to be the use of ks versus lambda, i.e.
ks = sqrt (n) * d;
pval = 1 - kolmogorov_smirnov_cdf (ks);
versus
lambda = max((sqrt(n) + 0.12 + 0.11/sqrt(n)) * d , 0);
pval = 1 - kolmogorov_smirnov_cdf (lambda);
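Plugging the numbers from Example 1 into these two formulas reproduces both reported p-values (a quick check of my own; per the code above, both use the same asymptotic Kolmogorov survival function Q(lambda) = 2*sum_{j>=1} (-1)^(j-1)*exp(-2*j^2*lambda^2)):

% Asymptotic Kolmogorov survival function, truncated after 100 terms.
kolmQ = @(lam) 2 * sum(((-1).^(0:99)) .* exp(-2 * ((1:100).^2) * lam^2));

d  = 0.625;                 % two-sample KS statistic for the data in Example 1
n1 = 8;  n2 = 8;
n  = n1 * n2 / (n1 + n2);   % effective sample size, here 4

p_plain = kolmQ(sqrt(n) * d)                           % ~0.0879, as in R and Octave
p_corr  = kolmQ((sqrt(n) + 0.12 + 0.11/sqrt(n)) * d)   % ~0.0497, as in kstest2

So the discrepancy in Example 1 is explained entirely by the extra 0.12 + 0.11/sqrt(n) correction term applied to the statistic, not by a different limiting distribution.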
I presume the different test statistics arise from differences in the research papers cited by these two functions. If you want a deeper dive into the statistical theory, you may want to ask on Cross Validated.