
qqnorm and qqline in ggplot2

Tags: r, ggplot2, ggproto

Say I have a linear model LM and I want a Q-Q plot of its residuals. Normally I would use base R graphics:

qqnorm(residuals(LM), ylab="Residuals")
qqline(residuals(LM))

I can figure out how to get the qqnorm part of the plot, but I can't seem to manage the qqline:

ggplot(LM, aes(sample=.resid)) +
    stat_qq()

I suspect I'm missing something pretty basic, but it seems like there ought to be an easy way of doing this.

EDIT: Many thanks for the solution below. I've modified the code (very slightly) to extract the information from the linear model so that the plot works like the convenience plot in the R base graphics package.

ggQQ <- function(LM) { # argument: a linear model
    y <- quantile(LM$resid[!is.na(LM$resid)], c(0.25, 0.75))
    x <- qnorm(c(0.25, 0.75))
    slope <- diff(y)/diff(x)
    int <- y[1L] - slope * x[1L]
    p <- ggplot(LM, aes(sample=.resid)) +
        stat_qq(alpha = 0.5) +
        geom_abline(slope = slope, intercept = int, color="blue")

    return(p)
}
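As a quick sanity check on the line computation used in ggQQ(), here is a small sketch in base R. The model (mpg ~ wt on the built-in mtcars data) and the names fit and res are my own assumptions for illustration, not part of the original question. It only confirms that the computed line passes through the first and third quartile points, which is how the four lines borrowed from qqline() define it.

```r
# Hypothetical example model; any lm object would do here.
fit <- lm(mpg ~ wt, data = mtcars)
res <- residuals(fit)

# Sample quartiles of the residuals and the matching standard normal quartiles.
y <- quantile(res[!is.na(res)], c(0.25, 0.75), names = FALSE)
x <- qnorm(c(0.25, 0.75))

# Slope and intercept of the line through the two (x, y) quartile points.
slope <- diff(y) / diff(x)
int   <- y[1L] - slope * x[1L]

# The line evaluated at the theoretical quartiles should give back
# the sample quartiles (up to floating-point tolerance).
stopifnot(isTRUE(all.equal(int + slope * x, y)))
```

Because the line is anchored at the quartiles rather than fitted by least squares, a few extreme residuals in the tails do not pull it around.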
Peter asked Dec 05 '10 02:12




1 Answer

The following code will give you the plot you want. The ggplot2 package doesn't seem to contain code for calculating the parameters of the qqline, so I don't know if it's possible to achieve such a plot in a (comprehensible) one-liner.

qqplot.data <- function(vec) { # argument: vector of numbers
  # following four lines from base R's qqline()
  y <- quantile(vec[!is.na(vec)], c(0.25, 0.75))
  x <- qnorm(c(0.25, 0.75))
  slope <- diff(y)/diff(x)
  int <- y[1L] - slope * x[1L]

  d <- data.frame(resids = vec)

  ggplot(d, aes(sample = resids)) + stat_qq() + geom_abline(slope = slope, intercept = int)
}
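For readers on a newer ggplot2: as far as I know, releases from 3.0.0 onward include stat_qq_line() (and geom_qq_line()), which by default draws the reference line through the first and third quartiles, much like the manual calculation above. The sketch below assumes such a version is installed; the model on mtcars and the names fit and d are hypothetical, chosen only to make the example self-contained.

```r
library(ggplot2)

# Hypothetical example model; substitute your own lm object.
fit <- lm(mpg ~ wt, data = mtcars)

# Put the residuals in a data frame so ggplot() does not need to
# know anything about lm objects.
d <- data.frame(resids = residuals(fit))

# stat_qq() draws the points; stat_qq_line() adds the reference line.
p <- ggplot(d, aes(sample = resids)) +
  stat_qq() +
  stat_qq_line()

p
```

On older installations that lack stat_qq_line(), the function above with geom_abline() remains the way to go.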
Aaron answered Sep 24 '22 11:09
