How to get the indices of values by quantile?

https://stackoverflow.com/questions/20112336

03-08-2022

Question

For example if my data looks like this:

> a <- c(1:25)
> a
[1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

How do I get a list like this:

1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5

So I want to divide the 25 elements into 5 sublists and find the index of the sublist that each element belongs to. The data is not sorted and is too large to sort. There are also missing values, whose index should be 0.

Sorry, to clarify: I don't need the groups to have equal sizes, but they need to be split at the 0.2, 0.4, 0.6, and 0.8 quantiles.

That is, the ith element of my output should be the number of the quantile group that the ith element of a belongs to. For example, 8 is in the second group, so the 8th element of my output is 2.

No correct solution

OTHER TIPS

Perhaps:

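 # Break points at the 0.2, 0.4, 0.6 and 0.8 quantiles (plus min and max);
 # na.rm=TRUE is needed because quantile() stops with an error on NA input.
 acut <- cut(a, 
             quantile(a, probs=c(0, 0.2, 0.4, 0.6, 0.8, 1), na.rm=TRUE), 
             include.lowest=TRUE)

 # Group number (1 to 5) for each element; missing values stay NA.
 as.numeric(acut)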
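
The question also asks for missing values to map to 0. A minimal sketch of how the cut()/quantile() idea could be extended for that, assuming base R only; the function name quantile_group and its argument n are made up for illustration and are not part of the original answer:

```r
# Hypothetical helper: quantile group (1..n) per element, 0 for NA.
quantile_group <- function(x, n = 5) {
  # Break points at the 0, 1/n, ..., 1 quantiles, ignoring missing values.
  breaks <- quantile(x, probs = seq(0, 1, length.out = n + 1), na.rm = TRUE)
  # cut() returns NA for missing inputs; include.lowest keeps the minimum in group 1.
  grp <- as.integer(cut(x, breaks = breaks, include.lowest = TRUE))
  # Map missing values to group 0, as requested in the question.
  grp[is.na(grp)] <- 0L
  grp
}

quantile_group(1:25)
# 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5

quantile_group(c(3, NA, 10, 7, NA, 1))
# 2 0 5 4 0 1
```

One caveat: if the data has many tied values, two quantiles can coincide, and cut() then stops with a "'breaks' are not unique" error.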

# random data with 3 NAs
> a<-sample(c(NA,NA,NA,sample(1:1000,25)))
> a
 [1] 414 744 897 777  20 371 625 462 341 766  NA 243  NA 213 198 691  NA 325 275 526 830 179  40 601  51 725  68 709
> b<-ceiling(rank(a,na.last="keep")/length(which(!is.na(a)))*5)
> b
 [1]  3  5  5  5  1  3  4  3  3  5 NA  2 NA  2  2  4 NA  2  2  3  5  1  1  4  1  4  1  4
# check that all groups have the same size
> table(b)
b
1 2 3 4 5 
5 5 5 5 5
# finally, set the missing values to group 0
> b[is.na(b)] <- 0
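
The rank-based approach can be wrapped in a small function. This is only a sketch: the name rank_group is invented here, and it multiplies before dividing (r * n / n_obs), a small deviation from the transcript meant to keep the arithmetic exact for whole-number ranks. Note that this method splits by rank rather than at the quantile values, so it gives equal-sized groups only when there are no ties:

```r
# Hypothetical helper: rank-based group (1..n) per element, 0 for NA.
rank_group <- function(x, n = 5) {
  # na.last = "keep" leaves NA in place instead of ranking it.
  r <- rank(x, na.last = "keep")
  # Number of non-missing observations.
  n_obs <- sum(!is.na(x))
  # Scale ranks to (0, n] and round up to get the group number.
  grp <- ceiling(r * n / n_obs)
  grp[is.na(grp)] <- 0
  grp
}

rank_group(c(10, NA, 30, 20, 50, 40))
# 1 0 3 2 5 4

# With ties, rank() averages the tied ranks, so tied values share a group
# and the groups are no longer equal in size.
rank_group(c(1, 1, 1, 1, 2))
# 3 3 3 3 5
```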

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow