Say I have a linear model LM and I want a Q-Q plot of its residuals. Normally I would use the R base graphics:
qqnorm(residuals(LM), ylab="Residuals")
qqline(residuals(LM))
I can figure out how to get the qqnorm part of the plot, but I can't seem to manage the qqline:
ggplot(LM, aes(sample=.resid)) + stat_qq()
I suspect I'm missing something pretty basic, but it seems like there ought to be an easy way of doing this.
EDIT: Many thanks for the solution below. I've modified the code (very slightly) to extract the information from the linear model so that the plot works like the convenience plot in the R base graphics package.
ggQQ <- function(LM) # argument: a linear model
{
    # line through the first and third quartiles, as in base R's qqline()
    res <- residuals(LM)  # extractor instead of relying on partial matching of LM$resid
    y <- quantile(res[!is.na(res)], c(0.25, 0.75))
    x <- qnorm(c(0.25, 0.75))
    slope <- diff(y)/diff(x)
    int <- y[1L] - slope * x[1L]
    p <- ggplot(LM, aes(sample=.resid)) +
        stat_qq(alpha = 0.5) +
        geom_abline(slope = slope, intercept = int, color="blue")
    return(p)
}
qqnorm creates a Normal Q-Q plot. You give it a vector of data and R plots the data in sorted order versus quantiles from a standard Normal distribution. For example, consider the trees data set that comes with R. It provides measurements of the girth, height and volume of timber in 31 felled black cherry trees.
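As a minimal sketch of that example, the following uses only base R and the built-in trees data. The choice of the Height column and the axis label are mine, not something the text above prescribes.

```r
# Normal Q-Q plot of tree heights from the built-in 'trees' data set.
heights <- trees$Height

# qqnorm() draws the plot and invisibly returns the plotted coordinates:
# x = theoretical standard Normal quantiles, y = the data values.
qq <- qqnorm(heights, ylab = "Height (ft)")

# Reference line through the first and third quartiles.
qqline(heights)
```

If the points fall close to the reference line, the sample is roughly consistent with a Normal distribution.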
The quantile-quantile plot is a graphical method for determining whether two samples of data came from the same population or not. A q-q plot is a plot of the quantiles of the first data set against the quantiles of the second data set. By a quantile, we mean the value below which a given fraction (or percent) of the points fall.
The purpose of the quantile-quantile (QQ) plot is to show whether two data sets come from the same distribution. The plot is constructed by putting the first data set's quantiles on the x-axis and the second data set's quantiles on the y-axis.
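The construction just described can be sketched by hand in base R. The two samples, the seed and the probability grid below are made up purely for illustration. Base R's qqplot(a, b) produces a similar plot directly.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
a <- rnorm(200)                    # first sample (made-up data)
b <- rnorm(200, mean = 1, sd = 2)  # second sample (made-up data)

# Evaluate both samples at the same set of probabilities.
probs <- seq(0.05, 0.95, by = 0.05)
qa <- quantile(a, probs)
qb <- quantile(b, probs)

# First sample's quantiles on the x-axis, second sample's on the y-axis.
plot(qa, qb, xlab = "Quantiles of a", ylab = "Quantiles of b")

# Points lying near the line y = x would suggest the same distribution.
abline(0, 1, lty = 2)
```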
The following code will give you the plot you want. The ggplot2 package doesn't seem to contain code for calculating the parameters of the qqline, so I don't know if it's possible to achieve such a plot in a (comprehensible) one-liner.
qqplot.data <- function (vec) # argument: vector of numbers
{
    # following four lines from base R's qqline()
    y <- quantile(vec[!is.na(vec)], c(0.25, 0.75))
    x <- qnorm(c(0.25, 0.75))
    slope <- diff(y)/diff(x)
    int <- y[1L] - slope * x[1L]

    d <- data.frame(resids = vec)

    ggplot(d, aes(sample = resids)) +
        stat_qq() +
        geom_abline(slope = slope, intercept = int)
}
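For readers on a newer ggplot2 (3.0.0 or later, as far as I know), stat_qq_line() draws a line through the first and third quartiles by default, so the manual slope and intercept calculation may not be needed. A minimal sketch, assuming that version is installed; the model fitted to the built-in trees data is only a stand-in for your own LM.

```r
library(ggplot2)  # assumes ggplot2 >= 3.0.0, which provides stat_qq_line()

# Stand-in linear model; replace with your own fitted model.
fit <- lm(Volume ~ Girth + Height, data = trees)

# Put the residuals in a data frame so ggplot() can map them to 'sample'.
d <- data.frame(resids = residuals(fit))

p <- ggplot(d, aes(sample = resids)) +
    stat_qq(alpha = 0.5) +
    stat_qq_line(color = "blue")  # default line.p = c(0.25, 0.75), like qqline()

print(p)
```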