Normalizing y-axis in histograms in R ggplot to proportion by group

Tags: r, ggplot2, histogram

My question is very similar to Normalizing y-axis in histograms in R ggplot to proportion, except that I have two groups of data of different sizes, and I would like each proportion to be relative to its own group's size instead of the total size.

To make it clearer, let's say I have two sets of data in a data frame:

library(ggplot2)
dataA<-rnorm(100,3,sd=2)
dataB<-rnorm(400,5,sd=3)
all<-data.frame(dataset=c(rep('A',length(dataA)),rep('B',length(dataB))),value=c(dataA,dataB))

I can plot the two distributions together with:

ggplot(all,aes(x=value,fill=dataset))+geom_histogram(alpha=0.5,position='identity',binwidth=0.5)

and instead of the frequency on the Y axis I can have the proportion with:

ggplot(all,aes(x=value,fill=dataset))+geom_histogram(aes(y=..count../sum(..count..)),alpha=0.5,position='identity',binwidth=0.5)

But this gives the proportion relative to the total data size (500 points here): is it possible to have it relative to each group size?

My goal here is to make it possible to visually compare the proportion of values in a given bin between A and B, independently of their respective sizes. Ideas that differ from my original one are also welcome!

Thanks!

520

asked Mar 04 '14 19:03

Erwan

1 Answer

Like this? [edited based on OP's comment]

ggplot(all,aes(x=value,fill=dataset))+
  geom_histogram(aes(y=0.5*..density..),
                 alpha=0.5,position='identity',binwidth=0.5)

Using y=..density.. scales each group's histogram so that the area under it is 1, i.e. sum(binwidth*y)=1 within each group. To make y the fraction of a group's points falling in each bin, use y = binwidth*..density.. instead. In your case binwidth=0.5, hence the factor 0.5 above.
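The arithmetic can be checked outside ggplot2. The sketch below uses base R's hist(), which defines density the same way (count divided by group size times bin width), so it only illustrates the relationship and does not reproduce ggplot2's exact bin edges. The helper name bin_fractions, the seed, and the way the breaks are built are assumptions made for this example.

```r
# Check that binwidth * density gives per-group bin fractions summing to 1.
set.seed(1)                      # arbitrary seed, only for reproducibility
dataA <- rnorm(100, 3, sd = 2)
dataB <- rnorm(400, 5, sd = 3)
binwidth <- 0.5

# Hypothetical helper: fraction of one group's values falling in each bin.
bin_fractions <- function(x, binwidth) {
  # Breaks on multiples of binwidth that cover the full range of x
  breaks <- seq(floor(min(x) / binwidth) * binwidth,
                ceiling(max(x) / binwidth) * binwidth,
                by = binwidth)
  h <- hist(x, breaks = breaks, plot = FALSE)
  binwidth * h$density           # same as h$counts / length(x)
}

fracA <- bin_fractions(dataA, binwidth)
fracB <- bin_fractions(dataB, binwidth)

# Each group's fractions add up to 1 (up to floating-point error),
# regardless of the group containing 100 or 400 points.
sum(fracA)
sum(fracB)
```

Because each group is normalized by its own size, the bar heights of A and B are directly comparable even though B has four times as many points.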

IMO this is a little easier to interpret:

ggplot(all,aes(x=value,fill=dataset))+
  geom_histogram(aes(y=0.5*..density..),binwidth=0.5)+
  facet_wrap(~dataset,nrow=2)

161

answered Sep 19 '22 00:09

jlhoward
