
Why doesn't `cumsum` work within groups or facets in ggplot?

Tags: r, ggplot2

Borrowing the example from "Plotting cumulative counts in ggplot2":

library(ggplot2)
x <- data.frame(A = replicate(200, sample(c("a", "b", "c"), 1)), X = rnorm(200))
ggplot(x, aes(x = X, color = A)) + stat_bin(aes(y = cumsum(..count..)), geom = "step")

In the resulting plot, `cumsum` accumulates across groups and facets instead of restarting within each one. Why does it do that? `..count..` is clearly computed within groups, so why is `cumsum` not applied within groups when it wraps `..count..`? Does ggplot internally concatenate all the `..count..` values into one vector and then apply `cumsum` to it?
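To make the suspicion above concrete, here is a small base-R sketch. It assumes (this is not confirmed anywhere in this thread) that the per-group counts are stacked into one data frame before the `cumsum` expression is evaluated. The `counts` data frame and its values are made up purely for illustration.

```r
# Hypothetical per-group bin counts, stacked the way a combined
# stat output might look (values invented for illustration).
counts <- data.frame(
  group = rep(c("a", "b", "c"), each = 3),
  count = c(2, 5, 1, 4, 0, 3, 1, 1, 6)
)

# cumsum over the whole column keeps accumulating across group boundaries.
counts$cum_all <- cumsum(counts$count)

# ave() applies cumsum separately within each group, so the total restarts.
counts$cum_by_group <- ave(counts$count, counts$group, FUN = cumsum)

print(counts)
```

Under that assumption, `cum_all` matches the behavior described here, while `cum_by_group` is the per-group cumulative count that was expected.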

How can I resolve this correctly without preprocessing the data, e.g. with plyr?

I don't mind if the geom is not a step; it can be a line or even bars, as long as the graph is a cumulative plot.

colinfang asked Nov 11 '22 20:11


1 Answer

Here's how I handle this with one line of code (ddply and mutate):

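library(plyr)
library(ggplot2)

df <- data.frame(x = rnorm(1000), kind = sample(c("a", "b", "c"), 1000, replace = TRUE),
         label = sample(1:5, 1000, replace = TRUE), attribute = sample(1:2, 1000, replace = TRUE))

# cum is the empirical cumulative proportion of x within each kind/label/attribute cell
dfx <- ddply(df, .(kind, label, attribute), mutate, cum = rank(x) / length(x))

ggplot(dfx, aes(x = x)) + geom_line(aes(y = cum, color = kind)) + facet_grid(label ~ attribute)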
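If preprocessing is to be avoided entirely, as the question asks, something along these lines may work. This is a sketch rather than a definitive fix: it assumes ggplot2 3.3.0 or later (for `after_stat()`, the newer spelling of `..count..`), and the seed, `bins = 30`, and object names are illustrative choices, not taken from the posts above.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only so the example is reproducible
x <- data.frame(A = sample(c("a", "b", "c"), 200, replace = TRUE),
                X = rnorm(200))

# Option 1: cumulative binned counts, restarted within each group.
# ave() applies cumsum separately for each value of the computed `group` column.
p_counts <- ggplot(x, aes(x = X, color = A)) +
  stat_bin(aes(y = after_stat(ave(count, group, FUN = cumsum))),
           geom = "step", bins = 30)

# Option 2: empirical CDF per group (proportions rather than counts).
p_ecdf <- ggplot(x, aes(x = X, color = A)) + stat_ecdf()
```

Printing `p_counts` or `p_ecdf` draws the plot. Adding `facet_grid()` to either should keep the curves separate per panel as well, provided the grouping distinguishes the panels.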
PeterK answered Nov 15 '22 05:11