This forum has already helped me a lot in producing the code, which I expected to return a histogram of a specific variable overlaid with its empirical normal curve. I used ggplot2 and stat_function to write the code. Unfortunately, the code produces a plot with the correct histogram, but the normal curve is a straight line at zero (the red line in the plot produced by the following code).
For this minimal example I used the mtcars dataset - the same behavior of ggplot and stat_function is observed with my original data set.
This is the code I wrote and used:
library(ggplot2)
mtcars
hist_staff <- ggplot(mtcars, aes(x = mtcars$mpg)) +
geom_histogram(binwidth = 2, colour = "black", aes(fill = ..count..)) +
scale_fill_gradient("Count", low = "#DCDCDC", high = "#7C7C7C") +
stat_function(fun = dnorm, colour = "red")
print(hist_staff)
I also tried to specify dnorm:
stat_function(fun = dnorm(mtcars$mpg, mean = mean(mtcars$mpg), sd = sd(mtcars$mpg))
That did not work either: an error message was returned stating that the arguments are not numerical.
I hope you people can help me! Thanks a lot in advance!
Best, Jannik
A basic histogram can be created with the hist function. To add a normal curve or a density line, you need to create a density histogram by setting prob = TRUE as an argument.
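As a rough sketch of the base-R approach just described, assuming the mtcars data from the question: the colours, title, and the extra kernel density line are illustrative choices, not part of the original post.

```r
# Base R only: density histogram of mpg with a fitted normal curve.
mpg <- mtcars$mpg

# prob = TRUE puts the y axis on the density scale (bar areas sum to 1).
h <- hist(mpg, prob = TRUE, col = "grey85",
          main = "mpg with fitted normal curve", xlab = "mpg")

# Normal curve using the sample mean and standard deviation.
curve(dnorm(x, mean = mean(mpg), sd = sd(mpg)),
      add = TRUE, col = "red", lwd = 2)

# Optional: kernel density estimate as a dashed line for comparison.
lines(density(mpg), lty = 2)
```

Because the histogram is on the density scale, the dnorm curve and the bars share the same y axis, which is the point the ggplot2 answer further down also relies on.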
To draw a normal curve on its own, create a ggplot base layer whose x-axis ranges from -4 to 4 (or whatever range you want) and map that range to the x aesthetic with aes(x = x). Then add stat_function and pass dnorm as its fun argument to make it a normal curve.
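A minimal sketch of that recipe, assuming ggplot2 is installed. The two-row data frame and the name p are made up for illustration; stat_function only needs the data to define the x range.

```r
library(ggplot2)

# Two x values are enough to set the range; stat_function evaluates
# dnorm on a grid between them (standard normal: mean 0, sd 1).
p <- ggplot(data.frame(x = c(-4, 4)), aes(x = x)) +
  stat_function(fun = dnorm, colour = "red")

print(p)
```

With no args supplied, dnorm uses mean 0 and sd 1, which is why the same call looks like a flat line when the x axis covers values such as 10 to 34.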
You can also make histograms by using ggplot2, “a plotting system for R, based on the grammar of graphics” that was created by Hadley Wickham. This post will focus on making a Histogram With ggplot2.
Your curve and histograms are on different y scales and you didn't check the help page on stat_function, otherwise you'd've put the arguments in a list as it clearly shows in the example. You also aren't doing the aes right in your initial ggplot call. I sincerely suggest hitting up more tutorials and books (or at a minimum the help pages) vs learn ggplot piecemeal on SO.
Once you fix the stat_function arg problem and the ggplot aes issue, you need to tackle the y axis scale difference. To do that, you'll need to switch the y for the histogram to use the density from the underlying stat_bin calculated data frame:
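library(ggplot2)

# Map the column name directly; no mtcars$ inside aes()
gg <- ggplot(mtcars, aes(x=mpg))
# Put the histogram on the density scale so it matches dnorm
gg <- gg + geom_histogram(binwidth=2, colour="black",
                          aes(y=..density.., fill=..count..))
gg <- gg + scale_fill_gradient("Count", low="#DCDCDC", high="#7C7C7C")
# Pass the sample mean and sd to dnorm through args
gg <- gg + stat_function(fun=dnorm,
                         color="red",
                         args=list(mean=mean(mtcars$mpg),
                                   sd=sd(mtcars$mpg)))
gg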
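To see why the original curve looked flat, here is a small base-R check of the two scale problems. It is only a sketch: the break points are an assumption chosen to cover the mpg range with width-2 bins and will not necessarily match the bin edges ggplot2 picks.

```r
mpg <- mtcars$mpg
n <- length(mpg)
binwidth <- 2

# Assumed breaks covering the range of mpg in steps of binwidth.
breaks <- seq(10, 34, by = binwidth)
h <- hist(mpg, breaks = breaks, plot = FALSE)

# For equal-width bins: density = count / (n * binwidth).
scale_check <- all.equal(h$density, h$counts / (n * binwidth))

# Problem 1: without args, dnorm is the standard normal, which is
# practically zero everywhere on the mpg range.
default_curve_max <- max(dnorm(range(mpg)))

# Problem 2: even with the right mean and sd, the curve is a density.
# Its peak only reaches count height after multiplying by n * binwidth.
peak_density <- dnorm(mean(mpg), mean = mean(mpg), sd = sd(mpg))
peak_as_count <- peak_density * n * binwidth

print(scale_check)
print(c(default_curve_max = default_curve_max,
        peak_density = peak_density,
        peak_as_count = peak_as_count))
```

So there are two ways to line things up: put the histogram on the density scale, as in the code above, or keep counts on the y axis and scale the curve by n * binwidth.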