My question is very similar to "Normalizing y-axis in histograms in R ggplot to proportion by group", except that I need density plots and I would like the y-axis to be a rate, such as x counts per 1000 patients.
I have multiple groups of data of different sizes, and I would like each proportion to be relative to its own group size instead of the total size.
To make it clearer, let's say I have two sets of data in one data frame.
Example data:
dataA<-rnorm(10000,3,sd=2)
dataB<-rnorm(40000,5,sd=3)
bp_combi<-data.frame(dataset=c(rep('A',length(dataA)),rep('B',length(dataB))),
value=c(dataA,dataB))
I can plot the distributions together relative to the total size, but not relative to the size of each group.
library(ggplot2)
# Scale the per-group counts by the total number of rows, per 1000 patients
combi_dens <- ggplot(bp_combi,
                     aes(x = value,
                         y = after_stat(count) / nrow(bp_combi) * 1000,
                         fill = dataset)) +
  geom_density(bw = 1, alpha = 0.4, size = 1.5)
Is it possible to have it relative to each group's size instead?
Thanks!
For those still interested, the answer is rather simple: first add a column holding the size of each group, then divide by that column in ggplot.
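library(dplyr)
library(ggplot2)
# Number of rows per dataset, stored in column n
unique_episodes <- bp_combi %>% count(dataset)
# Attach the group size to every row
data2 <- merge(x = bp_combi, y = unique_episodes, by = "dataset", all.x = TRUE)
# Plot the merged data so that n is the size of each row's own group
combi_dens <- ggplot(data2,
                     aes(x = value,
                         y = after_stat(count) / n * 1000, fill = dataset)) +
  geom_density(bw = 1, alpha = 0.4, size = 1.5)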
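A possibly simpler route, sketched here as an alternative and not as the method above: ggplot2 documents the `count` computed by `stat_density()` as the density multiplied by the number of points in the group, so dividing `count` by the group size should reduce to `density`. The sketch below assumes ggplot2 3.3.0 or later (for `after_stat()`); the fixed seed and the y-axis label are my own additions for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
bp_combi <- data.frame(
  dataset = c(rep("A", 10000), rep("B", 40000)),
  value   = c(rnorm(10000, 3, sd = 2), rnorm(40000, 5, sd = 3))
)

# count = density * (number of points in the group), so
# count / group size * 1000 is the same as density * 1000 within each group.
combi_dens <- ggplot(bp_combi,
                     aes(x = value,
                         y = after_stat(density) * 1000,
                         fill = dataset)) +
  geom_density(bw = 1, alpha = 0.4) +
  labs(y = "Rate per 1000 patients in group")  # illustrative label

# Pull out the computed curves to inspect them without drawing the plot.
curves <- layer_data(combi_dens)
```

Printing `combi_dens` draws the plot. In `curves`, the columns `count`, `n`, and `density` can be compared directly to check that each group is scaled by its own size, without a separate merge step.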