The Kolmogorov-Smirnov statistic is defined as the maximum distance between the empirical and the hypothesized cumulative distribution function. Rather than looking at numbers, I think it is much preferable to locate the maximum difference using a graph.
I know how to plot the empirical distribution function
p1<-qplot(rnorm(30),stat="ecdf",geom="step")
but could you please tell me how to add the cumulative distribution function of the theoretical distribution to the same plot? In my case, the theoretical distribution is the standard normal, but I am interested in generalizing this to any distribution function.
Thank you.
If you want to use ggplot, just do
library(ggplot2)
set.seed(15)
dd <- data.frame(x=rnorm(30))
# empirical CDF as a step function, theoretical CDF (pnorm) in red
ggplot(dd, aes(x)) +
  stat_ecdf() +
  stat_function(fun = pnorm, colour = "red")
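The same pattern should carry over to other theoretical distributions: pass the matching CDF as fun and its parameters through the args argument of stat_function. A minimal sketch, assuming an exponential distribution with rate 2 (the data frame dd_exp and the rate are made up for illustration and are not part of the question):

```r
library(ggplot2)

set.seed(15)
# Hypothetical example data: 30 draws from an exponential distribution
dd_exp <- data.frame(x = rexp(30, rate = 2))

# Empirical CDF as steps, theoretical CDF (pexp with rate = 2) in red
p <- ggplot(dd_exp, aes(x)) +
  stat_ecdf() +
  stat_function(fun = pexp, args = list(rate = 2), colour = "red")
p
```

Any function that takes a vector of x values and returns cumulative probabilities can be used as fun, so a custom CDF should work the same way.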
If you like, you can find the observed point at which the distance between the two curves is largest with
ed <- ecdf(dd$x)
maxdiffidx <- which.max(abs(ed(dd$x)-pnorm(dd$x)))
maxdiffat <- dd$x[maxdiffidx]
and add that to the plot with
ggplot(dd, aes(x)) +
  stat_ecdf() +
  stat_function(fun = pnorm, colour = "red") +
  geom_vline(xintercept = maxdiffat, linetype = 2)
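Note that the which.max computation above compares the two curves only at the data points, using the value of the empirical CDF after each jump. The Kolmogorov-Smirnov statistic also takes into account the value just before each jump, so the two can differ. A minimal sketch in base R, assuming the same standard normal setup (the names xs, Fx, d_plus, d_minus and D are introduced here for illustration):

```r
set.seed(15)
x  <- rnorm(30)
n  <- length(x)
xs <- sort(x)
Fx <- pnorm(xs)  # theoretical CDF evaluated at the sorted data points

# The empirical CDF jumps from (i - 1)/n to i/n at the i-th sorted point,
# so the theoretical CDF is compared with both levels.
d_plus  <- max(seq_len(n) / n - Fx)        # empirical CDF above the theoretical one
d_minus <- max(Fx - (seq_len(n) - 1) / n)  # empirical CDF below the theoretical one
D <- max(d_plus, d_minus)

# Cross-check against the statistic reported by stats::ks.test
ks <- ks.test(x, "pnorm")
all.equal(D, unname(ks$statistic))
```

If d_minus is the larger of the two, the maximum gap sits just to the left of a jump, which a comparison at the data points alone would miss.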