Is there a better way to create quantile "dummies" / factors in R?

Tags: r, quantile

I'd like to assign factors representing quantiles, and I need them to be numeric. That's why I wrote the following function, which basically solves my problem:

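qdum <- function(v, q) {
  # Upper threshold of each of the q quantile groups
  qd <- quantile(v, 1:q / q)
  # Column a holds the data, column b the quantile group (0 = not yet assigned)
  v <- data.frame(a = v, b = 0)
  for (i in 1:q) {
    if (i == 1) {
      # <= so that values equal to the first threshold land in group 1
      v$b[v$a <= qd[1]] <- 1
    } else {
      v$b[v$a > qd[i - 1] & v$a <= qd[i]] <- i
    }
  }
  # Return the thresholds together with the labelled data
  list(qd, v)
}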

You may laugh now :). The returned list contains a variable that can be used to assign every observation to its corresponding quantile. My question now is: is there a better (more "native" or "core") way to do it? I know about quantcut (from the gtools package), but at least with the parameters I tried, I only ended up with those unwieldy (at least to me) thresholds.

Any feedback that helps me get better is appreciated!

Matt Bannert asked Oct 22 '10 15:10


1 Answer

With base R, use quantile() to figure out the splits and then cut() to convert the numeric variable to a discrete one:

qcut <- function(x, n) {
  # n + 1 evenly spaced probabilities give the n + 1 break points
  cut(x, quantile(x, seq(0, 1, length.out = n + 1)), labels = seq_len(n),
    include.lowest = TRUE)
}
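
To see what qcut returns, here is a minimal sketch on a small made-up vector (x is purely illustrative and not from the question). It assumes the default quantile type and data without heavy ties.

```r
# Same definition as in the answer, repeated so the snippet runs on its own
qcut <- function(x, n) {
  cut(x, quantile(x, seq(0, 1, length.out = n + 1)), labels = seq_len(n),
    include.lowest = TRUE)
}

x <- 1:8          # illustrative data
f <- qcut(x, 4)   # factor with levels "1", "2", "3", "4"

levels(f)
table(f)          # two observations fall into each quartile here
```

Note that cut() stops with an error if the quantile break points are not unique, which can happen when the data contain many tied values.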

or if you just want the number:

qcut2 <- function(x, n) {
  # all.inside = TRUE keeps the maximum in group n instead of n + 1
  findInterval(x, quantile(x, seq(0, 1, length.out = n + 1)), all.inside = TRUE)
}
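
One detail worth knowing: the two versions can disagree for values that sit exactly on a break point, because cut() uses right-closed intervals by default while findInterval() uses left-closed ones. The sketch below shows this on a made-up vector. The qcut3 variant is my own hypothetical addition, not part of the answer above; it assumes R 3.3.0 or later, where findInterval() has a left.open argument.

```r
qcut <- function(x, n) {
  cut(x, quantile(x, seq(0, 1, length.out = n + 1)), labels = seq_len(n),
    include.lowest = TRUE)
}

qcut2 <- function(x, n) {
  findInterval(x, quantile(x, seq(0, 1, length.out = n + 1)), all.inside = TRUE)
}

# Hypothetical variant: left-open intervals so boundaries are handled like cut()
qcut3 <- function(x, n) {
  findInterval(x, quantile(x, seq(0, 1, length.out = n + 1)),
    all.inside = TRUE, left.open = TRUE)
}

x <- 1:9                          # illustrative data; quartile breaks are 1, 3, 5, 7, 9

via_cut  <- as.integer(qcut(x, 4))   # 1 1 1 2 2 3 3 4 4
via_find <- qcut2(x, 4)              # 1 1 2 2 3 3 4 4 4
via_open <- qcut3(x, 4)              # 1 1 1 2 2 3 3 4 4

which(via_cut != via_find)           # positions of the boundary values 3, 5 and 7
```

If the group numbers need to match between the two approaches, something like qcut3 may be the safer choice; otherwise either convention is fine as long as it is applied consistently.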

hadley answered Nov 16 '22 01:11