I want to create the following histogram with a density overlay using ggplot2. With base graphics it is really easy:
    set.seed(46)
    vector <- rnorm(500)
    breaks <- quantile(vector, seq(0, 1, by = 0.1))
    labels = 1:(length(breaks) - 1)
    den = density(vector)
    # df is only created further down, so plot the raw vector here
    hist(vector, breaks = breaks, col = rainbow(length(breaks)), probability = TRUE)
    lines(den)
With ggplot I have reached this so far:
    library(ggplot2)
    seg <- cut(vector, breaks, labels = labels, include.lowest = TRUE, right = TRUE)
    df = data.frame(vector = vector, seg = seg)
    ggplot(df) +
      geom_histogram(breaks = breaks, aes(x = vector, y = ..density.., fill = seg)) +
      geom_density(aes(x = vector, y = ..density..))
However, the y scale is wrong. I have noticed that the following call gets the y scale right:
    ggplot(df) +
      geom_histogram(breaks = breaks, aes(x = vector, y = ..density..)) +
      geom_density(aes(x = vector, y = ..density..))
I just do not understand it. y=..density.. is there, and that should be the height. So why on earth does my scale change when I add the fill?
I do need the colours. I just want a histogram that uses my own breaks, with each block coloured in order using the default ggplot fill colours.
A basic histogram can be created with the hist function. To add a normal curve or a density line, you need to create a density histogram by setting prob = TRUE.
You can also make histograms with ggplot2, "a plotting system for R, based on the grammar of graphics" that was created by Hadley Wickham. This post focuses on making a histogram with ggplot2.
A density plot is a representation of the distribution of a numeric variable. It is a smoothed version of the histogram and is used in the same kind of situation. Density plots are built in ggplot2 with the geom_density geom.
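As a minimal sketch of a density plot built with geom_density, the following reuses the seed and sample size from the question. The data frame name `df` and the plot object `p` are just illustrative names.

```r
library(ggplot2)

# Same simulated data as in the question
set.seed(46)
df <- data.frame(vector = rnorm(500))

# geom_density() maps y to the estimated density by default,
# so only the x aesthetic is needed
p <- ggplot(df, aes(x = vector)) +
  geom_density()

print(p)
```

The curve is a kernel density estimate, so its exact shape depends on the bandwidth that geom_density picks by default.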
Manually, I added colors to your percentile bars. See if this works for you.
    library(ggplot2)
    ggplot(df, aes(x = vector)) +
      geom_histogram(breaks = breaks, aes(y = ..density..), colour = "black",
                     fill = c("red", "orange", "yellow", "lightgreen", "green",
                              "darkgreen", "blue", "darkblue", "purple", "pink")) +
      geom_density(aes(y = ..density..)) +
      scale_x_continuous(breaks = c(-3, -2, -1, 0, 1, 2, 3)) +
      ylab("Density") +
      xlab("df$vector") +
      ggtitle("Histogram of df$vector") +
      theme_bw() +
      theme(plot.title = element_text(size = 20),
            axis.title.y = element_text(size = 16, vjust = +0.2),
            axis.title.x = element_text(size = 16, vjust = -0.2),
            axis.text.y = element_text(size = 14),
            axis.text.x = element_text(size = 14),
            panel.grid.major = element_blank(),
            panel.grid.minor = element_blank())
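fill=seg results in grouping. You are actually getting a different histogram for each value of seg, and each of those histograms is normalised to area 1 on its own, which is why the bars come out taller. If you don't need the colours, you could use this: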
    ggplot(df) +
      geom_histogram(breaks = breaks, aes(x = vector, y = ..density..), position = "identity") +
      geom_density(aes(x = vector, y = ..density..))
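To see the effect of the grouping in numbers, here is a rough base R sketch that mimics the per-group normalisation by hand. It assumes that the density for a bar is its count divided by the group size and the bin width; the names `ungrouped` and `grouped` are only illustrative.

```r
set.seed(46)
vector <- rnorm(500)
breaks <- quantile(vector, seq(0, 1, by = 0.1))

# Integer bin index (1..10) for every observation
seg <- cut(vector, breaks, labels = FALSE, include.lowest = TRUE)
width <- diff(unname(breaks))
counts <- as.vector(table(seg))

# Ungrouped: every bar is count / (total n * bin width)
ungrouped <- counts / (length(vector) * width)

# Grouped by seg: each group only holds the points of its own bin,
# so n is the group size and every bar becomes 1 / bin width
grouped <- counts / (counts * width)

sum(ungrouped * width)  # total area of the ungrouped bars
sum(grouped * width)    # total area of the grouped bars
```

The first total should come out at about 1 and the second at about 10, one unit of area per group, which matches the inflated y scale seen with fill=seg.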
If you need the colours, it might be easiest to calculate the density values outside of ggplot2.
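One possible way to do that, sketched under the assumption that drawing the bars as rectangles is acceptable: let hist compute the densities without plotting, then hand the bin edges and heights to geom_rect. The helper data frame `bins` and its column names are made up for this example.

```r
library(ggplot2)

set.seed(46)
vector <- rnorm(500)
breaks <- quantile(vector, seq(0, 1, by = 0.1))

# Let base R compute the per-bin densities without drawing anything
h <- hist(vector, breaks = breaks, plot = FALSE)
b <- unname(h$breaks)

# One row per bin: left edge, right edge, bar height and a bin label
bins <- data.frame(
  xmin    = head(b, -1),
  xmax    = tail(b, -1),
  density = h$density,
  seg     = factor(seq_along(h$density))
)

# The bars are plain rectangles, so the fill mapping cannot regroup the data;
# the density curve is estimated from the raw vector as before
p <- ggplot() +
  geom_rect(data = bins,
            aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = density, fill = seg),
            colour = "black") +
  geom_density(data = data.frame(vector = vector), aes(x = vector)) +
  labs(x = "vector", y = "Density")

print(p)
```

Because seg is a factor mapped to fill, the bars pick up the default discrete ggplot fill colours, and the heights are on the same scale as the density curve.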