
Where can I find the limiting distribution of the Kolmogorov-Smirnov distance in R?

Tags:

r

While doing an experiment on importance sampling, I simulate values of Kolmogorov-Smirnov distances

$$ D_n = \max_x |\hat{F}_n(x)-F(x)| $$

where $n$ is the size of the original importance sample. I want to compare those values to the asymptotic distribution of the Kolmogorov-Smirnov test statistic, or Kolmogorov distribution, i.e.

$$ \sqrt{n}\, D_n \longrightarrow \sup_{t\in[0,1]}|B(t)| $$

where $B$ is the Brownian bridge.
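For reference, the cdf of this limit has a classical series form, $K(x) = 1 - 2\sum_{k\ge 1}(-1)^{k-1}e^{-2k^2x^2}$ for $x>0$, so simulated values of $\sqrt{n}D_n$ can also be compared against it without touching R internals. The following is a minimal sketch under that formula; the function name `pkolmogorov_limit` and the truncation parameter `terms` are illustrative and not part of R, and the truncated series is only approximate for very small $x$.

```r
# Hypothetical helper (not part of base R): cdf of sup |B(t)|, the Kolmogorov
# distribution, from the truncated series
#   K(x) = 1 - 2 * sum_{k >= 1} (-1)^(k - 1) * exp(-2 * k^2 * x^2).
pkolmogorov_limit <- function(x, terms = 100L) {
  k <- seq_len(terms)
  signs <- (-1)^(k - 1)
  vapply(x, function(xi) {
    if (xi <= 0) return(0)
    value <- 1 - 2 * sum(signs * exp(-2 * k^2 * xi^2))
    min(max(value, 0), 1)  # guard against truncation error for tiny x
  }, numeric(1))
}

# Usage idea: with simulated distances D_n from samples of size n,
#   scaled <- sqrt(n) * D_n
#   tail   <- 1 - pkolmogorov_limit(scaled)
print(pkolmogorov_limit(c(0.5, 1, 1.36, 2)))
```

Plotting the empirical cdf of the scaled distances against `pkolmogorov_limit` gives a direct visual check of how close the simulation is to the limit.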

Since ks.test relies on this asymptotic distribution, its cdf is already present somewhere in R and I would like to know how to access it. The R function ks.test contains the instruction

PVAL <- 1 - if (alternative == "two.sided") 
                .Call(C_pKolmogorov2x, STATISTIC, n)

but my own call to C_pKolmogorov2x does not work.

Asked by Xi'an, Jul 11 '14 10:07



1 Answer

Relevant excerpt from the "Writing R Extensions" manual:

Then, the directive in the NAMESPACE file

useDynLib(myDLL, .registration = TRUE)

causes the DLL to be loaded and also for the R variables foo, bar_sym, R_call_sym and R_version_sym to be defined in the package’s namespace.

In plain terms, this means (roughly) that the R variables referring to a package's compiled (non-R) routines are defined inside the package namespace and are not exported. Hence the need for the triple colon.

So if you find .Call(something, args) in a package's code, you can invoke it from the command line as .Call(package:::something, args). Since ks.test lives in the stats package, here that would be .Call(stats:::C_pKolmogorov2x, STATISTIC, n). This is why a plain call to C_pKolmogorov2x did not work: R could not find it, since the package namespace is intended for the package, not the user.
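As a hedged illustration of the triple-colon idea, the sketch below looks the symbol up in the stats namespace before calling it. The symbol name comes from the ks.test source quoted in the question; internal C routines are not a stable API, so it may be absent or renamed in other R versions, and the arguments 0.1 and 100 are arbitrary illustrative values.

```r
# Look the native symbol up in the stats namespace instead of assuming it
# exists: names of internal C routines can change between R versions.
ns  <- asNamespace("stats")
sym <- "C_pKolmogorov2x"  # name taken from the ks.test source quoted above

p_exact <- NULL
if (exists(sym, envir = ns, inherits = FALSE)) {
  # Equivalent to .Call(stats:::C_pKolmogorov2x, STATISTIC, n)
  # with illustrative values STATISTIC = 0.1 and n = 100.
  p_exact <- tryCatch(
    .Call(get(sym, envir = ns), 0.1, 100L),
    error = function(e) NULL
  )
} else {
  message(sym, " is not defined in this version of the stats package")
}
print(p_exact)
```

Going through `asNamespace()` and `get()` does the same job as `stats:::`, but lets the script fail gracefully when the symbol is missing.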

To find out where the external code lives, look at two files: first, the package's NAMESPACE file, to see whether useDynLib is used to register the external routines; then src/init.c, where all the external routines available from the package are registered.
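The registered routines can also be inspected from a running session, without opening the package sources. The sketch below uses `getDLLRegisteredRoutines()` on the stats DLL; the search pattern is only a guess at what Kolmogorov-Smirnov related routines might be called, and the registered names may differ from the prefixed R variables seen in the package code.

```r
# Make sure the stats namespace (and hence its DLL) is loaded.
loadNamespace("stats")

# List the .Call routines registered by the stats DLL.
routines   <- getDLLRegisteredRoutines("stats")
call_names <- names(routines$.Call)

# Keep names that look related to the Kolmogorov-Smirnov test
# (the pattern is an assumption about naming, not a documented convention).
ks_related <- grep("kolmogorov|smirnov|pks2", call_names,
                   ignore.case = TRUE, value = TRUE)
print(ks_related)
```

An empty result only means the pattern did not match in this R version; printing `call_names` in full shows everything that is registered.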

Answered by mpiktas, Nov 15 '22 06:11