What type of significance test is used in scipy.stats.spearmanr to produce the p-value it spits out? The documentation simply says that it's a two-sided p-value, but with respect to what distribution? Is it a t-distribution?
According to the documentation,
the p-value roughly indicates the probability of an uncorrelated system producing datasets that have a Spearman correlation at least as extreme as the one computed from these datasets. The p-values are not entirely reliable but are probably reasonable for datasets larger than 500 or so.
When you look into the source code, you can see that they calculate a t-value:
# rs is rho (the Spearman correlation), n is the number of observations
t = rs * np.sqrt((n-2) / ((rs+1.0)*(1.0-rs)))
and then calculate the two-sided p-value assuming a t-distribution with n-2 degrees of freedom:
prob = distributions.t.sf(np.abs(t),n-2)*2
This is also explained on Wikipedia as one option for calculating statistical significance.
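To check this yourself, here is a minimal sketch that recomputes the p-value by hand and compares it with what spearmanr returns. The data are made up purely for illustration, and the names x, y, p_scipy and p_manual are my own, not anything from the SciPy source. It assumes the default two-sided test and a rho that is not exactly +1 or -1 (otherwise the formula divides by zero).

```python
import numpy as np
from scipy import stats

# Hypothetical sample data, generated only for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 0.5 * x + rng.normal(size=50)

n = len(x)
rho, p_scipy = stats.spearmanr(x, y)

# Same transformation as in the source snippet: rho -> t statistic.
t = rho * np.sqrt((n - 2) / ((rho + 1.0) * (1.0 - rho)))

# Two-sided p-value from the survival function of a t-distribution
# with n - 2 degrees of freedom.
p_manual = 2 * stats.t.sf(np.abs(t), n - 2)

print(p_scipy, p_manual)
```

The two printed values should agree up to floating-point error, which is consistent with the p-value coming from a t-distribution with n-2 degrees of freedom.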