Plotting a 95% confidence interval for a lm object

How can I calculate and plot a confidence interval for my regression in R? So far I have two numeric vectors of equal length (x, y) and a regression object (lm.out). I have made a scatterplot of y against x and added the regression line to this plot. I am looking for a way to add a 95% prediction confidence band for lm.out to the plot. I've tried using the predict function, but I don't know where to start with it :/. Here is my code at the moment:

x=c(1,2,3,4,5,6,7,8,9,0)
y=c(13,28,43,35,96,84,101,110,108,13)

lm.out <- lm(y ~ x)

plot(x,y)

regression.data = summary(lm.out) #save regression summary as variable
names(regression.data) #get names so we can index this data
a= regression.data$coefficients["(Intercept)","Estimate"] #grab values
b= regression.data$coefficients["x","Estimate"]
abline(a,b) #add the regression line

Thank you!

Edit: I've taken a look at the proposed duplicate and can't quite get to the bottom of it.

Max Lester asked Sep 28 '17 01:09


People also ask

How do you find the 95 confidence interval for a linear regression?

We can use the following formula to calculate a 95% confidence interval for the intercept:

95% C.I. for β0: b0 ± t_{α/2, n-2} × se(b0)

For example: 65.334 ± t_{0.05/2, 15-2} × 2.106.
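As a rough check of that formula on the data from the question, the sketch below computes the interval for the intercept by hand and compares it with confint(). The names coefs, b0, se_b0, t_crit, manual_ci and builtin_ci are only illustrative and are not from the original post.

```r
# Data and model from the question
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 0)
y <- c(13, 28, 43, 35, 96, 84, 101, 110, 108, 13)
lm.out <- lm(y ~ x)

# Estimate and standard error of the intercept
coefs <- summary(lm.out)$coefficients
b0    <- coefs["(Intercept)", "Estimate"]
se_b0 <- coefs["(Intercept)", "Std. Error"]

# Two-sided 95% critical value with n - 2 residual degrees of freedom
t_crit <- qt(1 - 0.05 / 2, df = df.residual(lm.out))

# b0 +/- t * se(b0), computed by hand and via confint()
manual_ci  <- c(b0 - t_crit * se_b0, b0 + t_crit * se_b0)
builtin_ci <- confint(lm.out, "(Intercept)", level = 0.95)

print(manual_ci)
print(builtin_ci)
```

Both approaches should agree up to floating-point error, so in practice confint() is probably the more convenient choice.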

How do you plot a confidence interval in a linear regression in R?

To find the confidence interval in R, create a new data.frame with the value(s) at which to predict. The prediction is made with the predict() function. The interval argument is set to "confidence" to get the interval for the mean response.

What is 95% confidence interval of a regression line?

The 95% confidence interval is commonly interpreted as there is a 95% probability that the true linear regression line of the population will lie within the confidence interval of the regression line calculated from the sample data.


2 Answers

You have to use predict() on a new vector of data, here newx.

x=c(1,2,3,4,5,6,7,8,9,0)

y=c(13,28,43,35,96,84,101,110,108,13)

lm.out <- lm(y ~ x)
newx = seq(min(x),max(x),by = 0.05)
conf_interval <- predict(lm.out, newdata=data.frame(x=newx), interval="confidence",
                         level = 0.95)
plot(x, y, xlab="x", ylab="y", main="Regression")
abline(lm.out, col="lightblue")
lines(newx, conf_interval[,2], col="blue", lty=2)
lines(newx, conf_interval[,3], col="blue", lty=2)

EDIT

As Ben mentions in the comments, this can also be done with matlines, as follows:

plot(x, y, xlab="x", ylab="y", main="Regression")
abline(lm.out, col="lightblue")
matlines(newx, conf_interval[,2:3], col = "blue", lty=2)
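If a band for new observations is wanted rather than a band for the mean response, predict() also accepts interval="prediction". The following is only a sketch of one way to draw both: the confidence band is shaded with polygon() and the wider prediction band is added with matlines(). The names new_df, conf_band and pred_band, and the colours, are illustrative choices rather than part of the answer above.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 0)
y <- c(13, 28, 43, 35, 96, 84, 101, 110, 108, 13)
lm.out <- lm(y ~ x)

newx   <- seq(min(x), max(x), by = 0.05)
new_df <- data.frame(x = newx)

# Band for the mean response, and band for individual new observations
conf_band <- predict(lm.out, newdata = new_df, interval = "confidence", level = 0.95)
pred_band <- predict(lm.out, newdata = new_df, interval = "prediction", level = 0.95)

# Use the wider band to set ylim so nothing is clipped
plot(x, y, ylim = range(pred_band, y), xlab = "x", ylab = "y", main = "Regression")

# Shade the confidence band: trace the lower edge, then the upper edge in reverse
polygon(c(newx, rev(newx)),
        c(conf_band[, "lwr"], rev(conf_band[, "upr"])),
        col = adjustcolor("blue", alpha.f = 0.2), border = NA)

abline(lm.out, col = "blue")
matlines(newx, pred_band[, c("lwr", "upr")], col = "red", lty = 3)
points(x, y)  # redraw the points on top of the shading
```

The prediction band is always at least as wide as the confidence band, because it also accounts for the scatter of individual observations around the line.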
Alejandro Andrade answered Oct 08 '22 12:10


I'm going to add a tip that would have saved me a lot of frustration when trying the method given by @Alejandro Andrade: If your data are in a data frame, then when you build your model with lm(), use the data= argument rather than $ notation. E.g., use

lm.out <- lm(y ~ x, data = mydata)

rather than

lm.out <- lm(mydata$y ~ mydata$x)

If you do the latter, then this statement

predict(lm.out, newdata=data.frame(x=newx), interval="confidence", level = 0.95)

either seems to ignore the new values passed using newdata= or fails silently. Either way, the output is the predictions for the original data, not the new data.
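A small sketch of the difference, using the data from the question placed in a data frame called mydata (the names good, bad, p_good and p_bad are made up for illustration). A likely explanation is that the model term is literally named mydata$x, so predict() finds no matching column in newdata and evaluates mydata$x against the original data instead. R typically does emit a warning about the mismatched row counts here, which is suppressed below to keep the output short.

```r
mydata <- data.frame(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 0),
                     y = c(13, 28, 43, 35, 96, 84, 101, 110, 108, 13))
newx <- seq(0, 9, by = 0.5)  # 19 new x values

good <- lm(y ~ x, data = mydata)   # variables looked up by name in data=
bad  <- lm(mydata$y ~ mydata$x)    # term is named "mydata$x", not "x"

p_good <- predict(good, newdata = data.frame(x = newx), interval = "confidence")

# newdata has no column called "mydata$x", so the original 10 values are used
p_bad <- suppressWarnings(
  predict(bad, newdata = data.frame(x = newx), interval = "confidence")
)

nrow(p_good)  # one row per value in newx
nrow(p_bad)   # one row per row of the original data
```

Comparing nrow() of the result with length(newx) is a quick way to catch this problem.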

Also, be sure your x variable gets the same name in the new data frame that it had in the original. That's easier to figure out because you do get an error, but knowing it ahead of time might save you a round of debugging.
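To illustrate the naming point, here is a sketch with hypothetical column names dose and resp (not from the original post), chosen so that no variable of the same name exists in the workspace. Note the assumption: if a vector with the predictor's name does exist in the workspace, as x does in the question's script, R may pick that up instead of raising an error.

```r
# Hypothetical data frame; dose and resp are illustrative names
mydata <- data.frame(dose = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 0),
                     resp = c(13, 28, 43, 35, 96, 84, 101, 110, 108, 13))
fit <- lm(resp ~ dose, data = mydata)

# Wrong column name in newdata: capture the error message instead of stopping
wrong <- tryCatch(
  predict(fit, newdata = data.frame(x = c(2.5, 7.5))),
  error = function(e) conditionMessage(e)
)
print(wrong)

# Same column name as in the model: one prediction per new value
right <- predict(fit, newdata = data.frame(dose = c(2.5, 7.5)))
print(right)
```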

Note: Tried to add this as a comment, but don't have enough reputation points.

acullum answered Oct 08 '22 11:10