
quantile vs ecdf results

Tags: r, quantile, ecdf

I am trying to use ecdf, but I am not sure if I am doing it right. My ultimate purpose is to find what quantile corresponds to a specific value. As an example:

sample_set <- c(20, 40, 60, 80, 100)
# Now I want to get the 0.75 quantile:
quantile(x = sample_set, probs = 0.75)
# result:
# 75%
#  80
# Let's use ecdf:
ecdf(x = sample_set)(80)
# result:
# 0.8

Why is there this discrepancy? Am I making some trivial mistake, or does it depend on the way quantile makes its calculations?

Thanks, Max

Max_IT asked Mar 10 '16 21:03




1 Answer

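There are two points. First, as you guessed, it depends on how quantile makes its calculations, specifically on its type argument. The default is type = 7, which interpolates between order statistics; what you probably want is type = 1, which corresponds to the inverse of the empirical distribution function (see ?quantile). Second, since ecdf gives a discrete step function, i.e. the ecdf is not strictly increasing, it has no exact inverse, so you cannot expect exact equality between p and the ecdf evaluated at the p-quantile because of the way quantile is defined (see the second formula). In your example even type = 1 returns 80 for p = 0.75, because 80 is the smallest value at which the ecdf (0.8) reaches at least 0.75.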
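As a rough illustration of both points, here is a minimal sketch using the sample from the question. The variable names Fn, probs, q7 and q1 are introduced here only for the example; the extra probabilities 0.8 and 0.9 are not from the question and are added just to show where the two types differ.

```r
sample_set <- c(20, 40, 60, 80, 100)
Fn <- ecdf(sample_set)                # step function; jumps by 1/5 at each value

# The original goal: which cumulative probability corresponds to the value 80?
Fn(80)                                # 0.8

probs <- c(0.75, 0.8, 0.9)

# Default type = 7 interpolates linearly between order statistics
q7 <- quantile(sample_set, probs = probs, type = 7, names = FALSE)

# type = 1 inverts the ECDF: the smallest sample value x with Fn(x) >= p
q1 <- quantile(sample_set, probs = probs, type = 1, names = FALSE)
q1                                    # 80 80 100

# Because Fn is a step function, going back through it gives a value >= p,
# with equality only when p happens to be one of the step heights (here 0.8)
Fn(q1)                                # 0.8 0.8 1.0
Fn(q1) >= probs                       # TRUE TRUE TRUE
```

With type = 1 the result is always one of the observed values, whereas type = 7 can return values such as 84 or 92 that are not in the sample, so Fn applied to them rounds down to the previous step.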

Julius Vainora answered Sep 28 '22 20:09
