Overlay normal curve to histogram in R

Tags: plot, r, histogram, gaussian

I have managed to find online how to overlay a normal curve on a histogram in R, but I would like to retain the usual "frequency" y-axis of a histogram. See the two code segments below, and notice how in the second, the y-axis is replaced with "density". How can I keep that y-axis as "frequency", as it is in the first plot?

AS A BONUS: I'd like to mark the SD regions (up to 3 SD) on the density curve as well. How can I do this? I tried abline, but the line extends to the top of the graph and looks ugly.

    g = d$mydata
    hist(g)

[Figure: histogram of g with "frequency" on the y-axis (https://i.stack.imgur.com/7bl5b.png)]

    g = d$mydata
    m <- mean(g)
    std <- sqrt(var(g))
    hist(g, density=20, breaks=20, prob=TRUE,
         xlab="x-variable", ylim=c(0, 2),
         main="normal curve over histogram")
    curve(dnorm(x, mean=m, sd=std),
          col="darkblue", lwd=2, add=TRUE, yaxt="n")

[Figure: histogram of g with the normal curve overlaid and "density" on the y-axis (https://i.stack.imgur.com/1dfiE.png)]

See how in the image above, the y-axis is "density". I'd like to get that to be "frequency".

334

asked Nov 19 '13 17:11

StanLe

2 Answers

Here's a nice easy way I found:

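    # Draw the histogram on the count scale and keep the returned object
    h <- hist(g, breaks = 10, density = 10,
              col = "lightgray", xlab = "Accuracy", main = "Overall")
    xfit <- seq(min(g), max(g), length.out = 40)
    yfit <- dnorm(xfit, mean = mean(g), sd = sd(g))
    # Rescale density to counts: bin width times number of observations
    yfit <- yfit * diff(h$mids[1:2]) * length(g)
    lines(xfit, yfit, col = "black", lwd = 2)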
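To see why that scaling factor works, here is a minimal check, assuming equal-width bins. The question's `g` is not available, so the built-in `mtcars$mpg` stands in for it; `bin_width`, `n` and `counts_from_density` are illustrative names, not part of the answer.

```r
# Stand-in data: the question's g (d$mydata) is not available, so this
# sketch uses the built-in mtcars$mpg instead (an assumption).
g <- mtcars$mpg

# plot = FALSE returns the histogram object without drawing anything.
h <- hist(g, breaks = 10, plot = FALSE)

bin_width <- diff(h$breaks[1:2])   # equal-width bins assumed
n <- length(g)

# For equal-width bins, density = counts / (n * bin_width), so
# multiplying a density by n * bin_width puts it on the count scale.
counts_from_density <- h$density * n * bin_width
print(all.equal(counts_from_density, as.numeric(h$counts)))

# The spacing of the midpoints equals the bin width, which is why
# diff(h$mids[1:2]) can serve as the bin width in the scaling step.
print(all.equal(diff(h$mids[1:2]), bin_width))
```

If the bins are not all the same width, a single scaling factor no longer applies, and `hist` switches to a density axis by default anyway.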

100

answered Sep 20 '22 08:09

StanLe

You need to find the right multiplier to convert density (an estimated curve where the area beneath the curve is 1) to counts. This can be easily calculated from the hist object.

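    myhist <- hist(mtcars$mpg)
    # With equal-width bins, counts / density is the same constant
    # for every non-empty bin, so the first element is enough
    multiplier <- myhist$counts / myhist$density
    mydensity <- density(mtcars$mpg)
    mydensity$y <- mydensity$y * multiplier[1]

    plot(myhist)
    lines(mydensity)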

[Figure: histogram of mtcars$mpg with the scaled density curve overlaid]

A more complete version, with a normal density and lines at each standard deviation away from the mean (including the mean):

    myhist <- hist(mtcars$mpg)
    multiplier <- myhist$counts / myhist$density
    mydensity <- density(mtcars$mpg)
    mydensity$y <- mydensity$y * multiplier[1]

    plot(myhist)
    lines(mydensity)

    myx <- seq(min(mtcars$mpg), max(mtcars$mpg), length.out = 100)
    mymean <- mean(mtcars$mpg)
    mysd <- sd(mtcars$mpg)

    # Normal curve rescaled from density to counts
    normal <- dnorm(x = myx, mean = mymean, sd = mysd)
    lines(myx, normal * multiplier[1], col = "blue", lwd = 2)

    # x positions at the mean and +/- 1, 2, 3 SD, with curve heights
    sd_x <- seq(mymean - 3 * mysd, mymean + 3 * mysd, by = mysd)
    sd_y <- dnorm(x = sd_x, mean = mymean, sd = mysd) * multiplier[1]

    # segments() stops at the curve, unlike abline()
    segments(x0 = sd_x, y0 = 0, x1 = sd_x, y1 = sd_y, col = "firebrick4", lwd = 2)
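The same steps can be wrapped in a small function. This is only a sketch, not part of the answer: `hist_with_normal` is a hypothetical name, it assumes equal-width bins, and it computes the scale as n times the bin width instead of `multiplier[1]`, which would be `NaN` if custom breaks left the first bin empty (0 / 0).

```r
# Hypothetical helper (not from either answer): draws a count-scale
# histogram, overlays the fitted normal curve, and marks the mean and
# +/- 1, 2, 3 SD with segments that stop at the curve.
hist_with_normal <- function(x, breaks = "Sturges", ...) {
  h <- hist(x, breaks = breaks, ...)

  # Scale factor from density to counts; assumes equal-width bins.
  scale <- length(x) * diff(h$breaks[1:2])

  m <- mean(x)
  s <- sd(x)

  # Normal curve on the count scale.
  xfit <- seq(min(x), max(x), length.out = 100)
  lines(xfit, dnorm(xfit, mean = m, sd = s) * scale, col = "blue", lwd = 2)

  # Vertical markers from the x-axis up to the curve height.
  sd_x <- m + (-3:3) * s
  sd_y <- dnorm(sd_x, mean = m, sd = s) * scale
  segments(x0 = sd_x, y0 = 0, x1 = sd_x, y1 = sd_y,
           col = "firebrick4", lwd = 2)

  # Return the pieces invisibly so they can be inspected or reused.
  invisible(list(hist = h, scale = scale, sd_x = sd_x, sd_y = sd_y))
}

# Usage with the same built-in data as the answer.
res <- hist_with_normal(mtcars$mpg)
```

Markers that fall outside the plotted x-range are clipped by the plot region, so widening `xlim` (passed through `...`) may be needed to see all seven.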

answered Sep 21 '22 08:09

Gregor Thomas
