
Breaks not unique error when using cut and ddply

Tags: r, cut, plyr

I am trying to split a dataset into quantiles within each group.

With the code below, cutting with seq(0, 1, 0.5) works fine, but when I change it to seq(0, 1, 0.2) I get:

Error in cut.default(x = fwd_quarts$v, breaks = quantile(fwd_quarts$v, : 'breaks' are not unique

I have tried different code but can't get away from the error. How do I adjust this so that the quantiles are created without the error when it is applied to larger data sets?

library(plyr)

g <- data.frame(g = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3))
v <- data.frame(v = c(1, 4, 4, 5, NA, 2, 6, NA, 7, 8))
df <- cbind(g, v)
df <- df[complete.cases(df), ]  # drop rows containing NA

# Works with seq(0, 1, 0.5); fails with seq(0, 1, 0.2)
ddf <- ddply(df, "g", function(fwd_quarts) {
  eps_quartile <- cut(
    x = fwd_quarts$v,
    breaks = quantile(fwd_quarts$v, probs = seq(0, 1, 0.5), na.rm = TRUE),
    labels = FALSE,
    include.lowest = TRUE
  )
  data.frame(eps_quartile)  # one bin index per row of the group
})

df <- cbind(df, fwde_quart = ddf$eps_quartile)
jazz_learn asked Oct 19 '22 18:10


1 Answer

This has nothing to do with ddply.

If your data does not produce unique breaks, you can make them unique by wrapping the breaks in unique():

breaks = unique(quantile(fwd_quarts$v, probs = seq(0, 1, 0.2)))

However, this leaves you with fewer levels than you originally asked for.
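As a rough illustration, here is the question's first group (g == 1, NAs already dropped) run through both versions of the breaks. The variable names raw_breaks, uniq_breaks and bins are made up for this sketch and are not part of the original code.

```r
# Values of v for g == 1 in the question's data
v <- c(1, 4, 4, 5)

# Quintile breaks: the 40% and 60% quantiles both land on 4,
# which is what makes cut() complain
raw_breaks <- quantile(v, probs = seq(0, 1, 0.2))
print(raw_breaks)

# Dropping the repeated value leaves five break points, i.e. four bins
uniq_breaks <- unique(raw_breaks)
print(uniq_breaks)

# cut() now runs, but only four bins exist instead of five
bins <- cut(v, breaks = uniq_breaks, labels = FALSE, include.lowest = TRUE)
print(bins)
```

Note that the bin indices then refer to the reduced set of breaks, so a given index no longer means the same quintile in every group.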

Generally speaking, if you have data like c(1,1,1,2) you can't break it into 3 groups. The number of groups should be less than or equal to the number of unique values in your data. HTH.
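A small sketch of that c(1,1,1,2) case, assuming three equal-probability groups are requested. The names x, breaks3, has_dupes and result are only for illustration.

```r
x <- c(1, 1, 1, 2)  # only two distinct values

# Break points for three equal-probability groups
breaks3 <- quantile(x, probs = c(0, 1/3, 2/3, 1))
print(breaks3)

# TRUE when at least one break value is repeated
has_dupes <- anyDuplicated(breaks3) > 0
print(has_dupes)

# cut() rejects repeated breaks; capture the error instead of stopping
result <- tryCatch(
  cut(x, breaks = breaks3, include.lowest = TRUE),
  error = function(e) e
)
print(conditionMessage(result))
```

Checking anyDuplicated() on the breaks before calling cut() is one way to spot the groups that will fail in a larger data set.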

Saurabh Pandey answered Oct 21 '22 10:10