Behavior of the `quantile` function in R

https://stackoverflow.com/questions/15298490

r
quantile

18-03-2022

Question

While working on a problem I noticed something I did not expect, and I don't know exactly what is happening. It is possible that I made a mistake, but let me start with an example:

x <- rnorm( 100 )
y <- x[ x > quantile( x, 0.1 ) ]
z <- x[ x > quantile( x, c( 0.1, 0.2 ) ) ]
a <- x[ x > quantile( x, c( 0.1, 0.2, 0.3 ) ) ]

We get three different results, but how should these results be interpreted? Are these the limits that are used?

UPDATE: I think I am asking the wrong question. How can we explain the following?

> x <- rnorm( 100 )
> length( x[ x > quantile( x, 0.1 ) ] )
[1] 90
> length( x[ x > quantile( x, 0.2 ) ] )
[1] 80
> length( x[ x > quantile( x, c( 0.1, 0.2 ) ) ] )
[1] 85

Solution

You're confused about how > interacts with R's recycling behavior. When quantile returns more than one value (as in the last two lines of your first example), R recycles that shorter vector to the same length as x so that > can compare the two vectors element-wise.

So, in the last two examples, R repeats the 2 or 3 values from quantile over and over again until the resulting vector is the same length as x, and then compares the two element-wise with >.
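
To make the recycling visible, here is a minimal sketch that builds the recycled vector by hand with rep_len and checks that it gives the same comparison as the implicit version. The seed and the names q, q_recycled, implicit and explicit are only illustrative and are not part of the question. Note also that in the three-quantile case, 100 is not a multiple of 3, so R recycles anyway but warns that the longer object length is not a multiple of the shorter one.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- rnorm(100)
q <- quantile(x, c(0.1, 0.2))     # vector of length 2

# What `>` does implicitly: repeat q until it is as long as x
q_recycled <- rep_len(unname(q), length(x))

implicit <- x > q                 # recycling done by R
explicit <- x > q_recycled        # recycling done by hand

# Both comparisons should agree element by element
same_result <- identical(unname(implicit), explicit)
print(same_result)
```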

Edit

Maybe my explanation wasn't clear enough. In the last line of your edit, x > quantile( x, c( 0.1, 0.2 ) ), R is comparing the first element of x with the 0.1 quantile, the second element of x with the 0.2 quantile, the third element of x with the 0.1 quantile, the fourth element of x with the 0.2 quantile, and so on. Got it? :)
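
This also explains why the mixed count lands between 80 and 90. The 80 values above the 0.2 quantile pass either threshold, the 10 values at or below the 0.1 quantile fail both, and the 10 values in between pass only when they happen to sit at an odd position in x. The sketch below checks that decomposition and then shows one way to get a separate count per quantile, assuming that is what was wanted. The seed and the names odd, even, n_mixed, n_split and counts are illustrative only.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- rnorm(100)
q <- quantile(x, c(0.1, 0.2))

odd  <- seq(1, length(x), by = 2) # positions compared with the 0.1 quantile
even <- seq(2, length(x), by = 2) # positions compared with the 0.2 quantile

# The recycled comparison is the sum of the two position-wise comparisons
n_mixed <- sum(x > q)
n_split <- sum(x[odd] > q[1]) + sum(x[even] > q[2])
print(n_mixed == n_split)

# One count per quantile: compare all of x against each threshold in turn
counts <- sapply(q, function(threshold) sum(x > threshold))
print(counts)
```

With continuous data such as rnorm output, counts should come out as 90 and 80, matching the single-quantile results in the question, while n_mixed depends on the particular sample.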

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow