
Behavior of the `quantile` function in R

Tags: r, quantile

While working on a problem I noticed something I did not expect. I don't know exactly what is happening, and it is possible that I made a mistake, but let me start with an example:

x <- rnorm( 100 )
y <- x[ x > quantile( x, 0.1 ) ]
z <- x[ x > quantile( x, c( 0.1, 0.2 ) ) ]
a <- x[ x > quantile( x, c( 0.1, 0.2, 0.3 ) ) ]

We get three different results, but how should these results be interpreted? Are these the limits that are used?

UPDATE: I think I am asking the wrong question. How can we explain the following:

> x <- rnorm( 100 )
> length( x[ x > quantile( x, 0.1 ) ] )
[1] 90
> length( x[ x > quantile( x, 0.2 ) ] )
[1] 80
> length( x[ x > quantile( x, c( 0.1, 0.2 ) ) ] )
[1] 85
Sam asked Dec 26 '22 07:12


1 Answer

You're confused about how > interacts with R's recycling behavior. When quantile returns more than one value (as in the last two examples), R recycles that shorter vector to the same length as x in order to make the vectorized comparison via >.

So, in the last two examples, it repeats the 2 or 3 values from quantile over and over until the resulting vector is the same length as x, and then compares the two element-wise with >.
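
Here is a small sketch of that recycling. The vectors v and thresholds are made-up illustrative values, not taken from the question:

```r
v <- c(1, 5, 2, 6, 3, 7)
thresholds <- c(2, 4)

# R stretches `thresholds` to 2, 4, 2, 4, 2, 4 before comparing
res <- v > thresholds
res
# [1] FALSE  TRUE FALSE  TRUE  TRUE  TRUE

# The same comparison with the recycling written out by hand
identical(res, v > rep_len(thresholds, length(v)))
# [1] TRUE
```

In the three-quantile case from the question, 100 is not a multiple of 3, so R still recycles but also warns that the longer object length is not a multiple of the shorter object length.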

Edit

Maybe my explanation wasn't clear enough. In the last line of your edit, x > quantile( x, c( 0.1, 0.2 ) ), R is comparing the first element of x with the 0.1 quantile, the second element of x with the 0.2 quantile, the third element of x with the 0.1 quantile, the fourth element of x with the 0.2 quantile, and so on. Got it? :)
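
To check this against your own numbers, here is a rough sketch that splits x into odd and even positions and counts by hand. The seed and the helper names odd, even, n_recycled and n_by_hand are just choices made for this illustration:

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
x <- rnorm(100)
q <- quantile(x, c(0.1, 0.2))

odd  <- seq(1, length(x), by = 2)  # positions compared with the 10% quantile
even <- seq(2, length(x), by = 2)  # positions compared with the 20% quantile

# Count produced by the recycled comparison
n_recycled <- sum(x > q)

# Count obtained by pairing each half of x with its own quantile
n_by_hand <- sum(x[odd] > q[1]) + sum(x[even] > q[2])

n_recycled == n_by_hand
# [1] TRUE
```

Roughly 90% of the 50 odd-position values exceed the 0.1 quantile and roughly 80% of the 50 even-position values exceed the 0.2 quantile, so the total lands near 45 + 40 = 85. The exact count depends on the sample, which is why it falls between 80 and 90 rather than matching either.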

joran answered Jan 07 '23 13:01