
R ggplot2 using ..count.. when using facet_grid

Tags:

plot

r

ggplot2

I am using RStudio on Ubuntu, with up-to-date versions of R and ggplot2.

I am trying to create a histogram in ggplot2 and to split the data by group.

I need the plot's y axis to show the relative frequency of each bin within the subgroup produced by the facet grid.

For example, if I have two entries in the data:

a group
1 1
2 2

I need to use facet_grid to split by group, and then to show that a has one bar at 1 that accounts for 100% of the rows in group 1, and likewise one bar at 2 for group 2.

I found that the way to do this is (..count..)/sum(..count..), but sum(..count..) sums the counts over the entire data frame rather than within each facet, which gives me unwanted results.

I can't find good documentation on advanced use of ..count..:

question about special ggplot variables

another question about ..count..

There is nothing very comprehensive in the docs.

This is the example code I am using:

library(ggplot2)

df <- data.frame(a = 1:10, b = 1:10, group = c(rep(1, 5), rep(2, 5)))
p <- ggplot(df) +
  geom_histogram(aes(x = a, y = (..count..)/sum(..count..))) +
  facet_grid(group ~ .)

You can see that the highest value on the y axis is 0.1. I would like each panel to show proportions within its own group instead, for example that 100% of the 1 values are in group 1.
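One way to check where the 0.1 comes from is to inspect the values ggplot2 computes for the layer. The sketch below assumes ggplot2 3.3.0 or later, where after_stat(count) is the newer spelling of ..count..; bins = 10 and the helper name panel_sums are arbitrary choices made for illustration. It compares the question's expression with density * width, which stat_bin normalises within each panel.

```r
library(ggplot2)

df <- data.frame(a = 1:10, b = 1:10, group = c(rep(1, 5), rep(2, 5)))

# Question's mapping: sum(count) runs over every bin in every panel.
p_all <- ggplot(df) +
  geom_histogram(aes(x = a, y = after_stat(count / sum(count))), bins = 10) +
  facet_grid(group ~ .)

# density is computed per panel, so density * width is a within-panel proportion.
p_panel <- ggplot(df) +
  geom_histogram(aes(x = a, y = after_stat(density * width)), bins = 10) +
  facet_grid(group ~ .)

# Hypothetical helper: total bar height in each panel of a built plot.
panel_sums <- function(p) {
  d <- layer_data(p)
  as.numeric(tapply(d$y, d$PANEL, sum))
}

totals_all <- panel_sums(p_all)
totals_panel <- panel_sums(p_panel)
```

Under these assumptions, totals_all comes out at 0.5 for each panel (each panel holds half of the ten rows), while totals_panel comes out at 1 for each panel.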

Edit:

Thanks to Jimbou for the answer and the reference to a well-built workaround that suits discrete data. Please note that my real problem involves continuous data and bins that group more than one value. Furthermore, there is no proper documentation on how to do this with ..count.., so I believe it is important to find a solution rather than rely on a workaround.

thebeancounter asked Jan 28 '26 15:01


2 Answers

Here is a dplyr solution.

library(dplyr)
df %>% group_by(group) %>% mutate(n = n(), prop = n / sum(n))
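A minimal sketch of how a per-group proportion could feed a histogram of continuous data, using the question's df. The column name w and the choice of bins = 5 are assumptions made here for illustration, not part of the answer. Each row gets a weight of one over its group size, so the bar heights in each facet sum to 1 however the bins fall.

```r
library(ggplot2)
library(dplyr)

df <- data.frame(a = 1:10, b = 1:10, group = c(rep(1, 5), rep(2, 5)))

# Each row weighs 1 / (rows in its group), so weights sum to 1 per group.
weighted <- df %>%
  group_by(group) %>%
  mutate(w = 1 / n()) %>%
  ungroup()

# With a weight aesthetic, stat_bin sums the weights in each bin
# instead of counting rows.
p <- ggplot(weighted, aes(x = a, weight = w)) +
  geom_histogram(bins = 5) +
  facet_grid(group ~ .)

# Total bar height per panel, taken from the computed layer data.
built <- layer_data(p)
panel_totals <- as.numeric(tapply(built$count, built$PANEL, sum))
```

On the question's df, the prop column from the one-liner above evaluates to 0.2 in every row, which is the same value as w here.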
shayaa answered Jan 31 '26 04:01


After a lot of experimenting, and with the very good directions you all gave, I found that a blend of Jimbou's and shayaa's answers, plus some added code, works beautifully.

t <- data %>% group_by(group, member, v_rate) %>% tally() %>% mutate(f = n / sum(n))

This groups the data by group, member, and v_rate, counts the rows in each combination, and divides each count by the total for its (group, member) pair, which gives the relative frequency within that pair.

Then we create the histogram with ggplot2 and use those values as the weight aesthetic of the histogram; otherwise it was all in vain.

p <- ggplot(t, aes(x = v_rate, weight = f)) + geom_histogram() + facet_grid(group ~ member)

That works great.
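For anyone who wants to run this end to end, here is a self-contained sketch of the same steps. The real data is not shown in this answer, so the data frame below is hypothetical (random v_rate values, two groups, two members), the result is named freq rather than t to avoid masking base R's t(), and bins = 10 is an arbitrary choice.

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-in for the real `data`, which is not shown above.
set.seed(1)
data <- data.frame(
  group  = rep(c("g1", "g2"), each = 20),
  member = rep(c("m1", "m2"), times = 20),
  v_rate = round(runif(40), 2)
)

# Count each (group, member, v_rate) combination, then divide by the total
# of its (group, member) pair; tally() drops the v_rate grouping level.
freq <- data %>%
  group_by(group, member, v_rate) %>%
  tally() %>%
  mutate(f = n / sum(n)) %>%
  ungroup()

# The weight aesthetic makes each bar the sum of f over the rows in its bin.
p <- ggplot(freq, aes(x = v_rate, weight = f)) +
  geom_histogram(bins = 10) +
  facet_grid(group ~ member)

# Total bar height per facet, taken from the computed layer data.
built <- layer_data(p)
panel_totals <- as.numeric(tapply(built$count, built$PANEL, sum))
```

Because f sums to 1 within every (group, member) pair, the bars in each of the four facets should add up to 1 regardless of how many bins are used.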

thebeancounter answered Jan 31 '26 05:01


