
Quantile Function for a Vector of Dates

Tags: date, sorting, r

I noticed that the base R quantile function does not support date arguments.

I appreciate that quantiles for dates need a careful definition (e.g. if you have 6 dates and ask for the 25th percentile, the position 0.25 * 6 = 1.5 is not a whole number, so a rounding rule has to be chosen).

Is there an efficient implementation of such a quantile function, either as part of base R or in another package?

The following sample function achieves essentially what I am interested in (with some tweaking to handle the case of the 0th percentile), but I imagine that more efficient implementations are possible.

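# Date quantile function: picks the element of the sorted vector
# whose position is closest to probs * n.
dquantile <- function(x, probs) {

  sx <- sort(x)

  # Position in the sorted vector. For very small probs this rounds to 0,
  # and R silently drops index 0, so that quantile is missing from the result.
  pos <- round(probs * length(sx))

  sx[pos]
}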

# Example.
dates <- as.Date("01/01/1900", "%d/%m/%Y") + floor( 36500 * runif(100000) )

dquantile(dates, c(0.001, 0.025, 0.975, 0.999) )
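
One possible tweak for the 0th percentile, as a minimal sketch: clamp the index so it never falls below 1. The name dquantile_clamped and the small example vector are hypothetical; the rounding rule is kept the same as in dquantile.

```r
# Hypothetical variant of dquantile: same rounding rule, but the index is
# clamped to at least 1 so that probs = 0 returns the earliest date.
dquantile_clamped <- function(x, probs) {
  sx <- sort(x)                               # sort() also drops NA values
  pos <- pmax(1, round(probs * length(sx)))   # never index position 0
  sx[pos]
}

# Small fixed example (assumed data, chosen only for illustration).
d <- as.Date("2020-01-01") + c(40, 10, 50, 20, 60, 30)
res <- dquantile_clamped(d, c(0, 0.5, 1))
print(res)
```

With this clamp, probs = 0 maps to the minimum and probs = 1 to the maximum of the input.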
owen88 asked Mar 21 '18 11:03


1 Answer

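The quantile function does support dates; you just need to specify the type argument. With type = 1 (the inverse of the empirical distribution function), every quantile is an actual element of the sample, so no interpolation between dates is required. Your problem can be solved with: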

dates <- as.Date("01/01/1900", "%d/%m/%Y") + floor( 36500 * runif(100000) )

quantile(dates, probs = c(0.001, 0.025, 0.975, 0.999), type = 1)

        0.1%         2.5%        97.5%        99.9% 
"1900-02-04" "1902-06-23" "1997-06-10" "1999-10-30" 
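
As a rough check of what type = 1 does, the sketch below compares it with a manual lookup of the ceiling(n * p)-th element of the sorted vector. The example dates and the variable names are assumptions made for illustration only.

```r
# Small fixed vector (assumed data) so the result is easy to verify by hand.
dates_small <- as.Date("2020-01-01") + c(40, 10, 50, 20, 60, 30)
probs <- c(0.25, 0.5, 1)

# type = 1 returns observed values only, so it needs no arithmetic on dates.
q <- quantile(dates_small, probs = probs, type = 1, names = FALSE)

# Manual equivalent: the ceiling(n * p)-th order statistic, clamped to >= 1.
sorted <- sort(dates_small)
idx <- pmax(1, ceiling(probs * length(sorted)))
manual <- sorted[idx]

print(q)
print(manual)
```

Both approaches should pick the same elements here, which is why type = 1 is a natural replacement for a hand-written sort-and-index function.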
Jonny answered Oct 05 '22 12:10