
Struggling to plot 99% confidence and prediction intervals in R

I am using the Auto data in R and need to plot the confidence intervals, but I am struggling. This is what I have so far:

I have created the linear regression model:

my_acc<-auto_df$acceleration
my_horse<-auto_df$horsepower
mydata <- data.frame(my_acc, my_horse )

car_linear_regression <- lm(my_acc ~ my_horse, mydata )

I have created the confidence and prediction intervals for ONE prediction, as the exercise asks:

conf_int<-predict(car_linear_regression,newdata = data.frame(my_horse = 93.5),interval = 'confidence' )
# the column name in newdata (my_horse) must match the predictor name used in the model

pred_int<-predict(car_linear_regression,newdata = data.frame(my_horse = 93.5),interval = 'prediction' )

Then I try to plot everything together, but I am totally stuck. I can plot the data with the regression line, but when I add the interval I only get this error:

Error in xy.coords(x, y) : 'x' and 'y' lengths differ

plot(my_acc ~ my_horse   , data = mydata, pch = 20, cex  = 1.5, col="blue", xlab=" car horsepower", ylab = "acceleration secs to 100km/h", main = "Confidence intervals and prediction intervals")
abline(car_linear_regression, lwd = 5,  col="red" )

lines(mydata$my_horse, conf_int[,"lwr"], col="red", type="b", pch="+")


Emilia Delizia asked Jan 24 '26 02:01



1 Answer

For the plot you definitely need predictions over the whole range of the predictor, i.e. from the minimum to the maximum of horsepower, not just at a single value.
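To see where the 'x' and 'y' lengths differ error comes from, here is a minimal sketch. It uses the built-in mtcars data as a stand-in for Auto (an assumption, so the snippet runs without ISLR); the names fit_demo, one_point, hp_grid and band are purely illustrative.

```r
# Stand-in data: built-in mtcars instead of Auto (assumption for a self-contained run)
fit_demo <- lm(qsec ~ hp, data = mtcars)

# Predicting at ONE new value gives a 1-row matrix with columns fit, lwr, upr
one_point <- predict(fit_demo, newdata = data.frame(hp = 93.5),
                     interval = "confidence")
dim(one_point)       # 1 row, 3 columns
length(mtcars$hp)    # 32 x-values

# lines(mtcars$hp, one_point[, "lwr"]) would fail here, because
# x has 32 values while y has only 1. Predicting over a grid gives
# one row per x-value, which is what lines() needs:
hp_grid <- seq(min(mtcars$hp), max(mtcars$hp), by = 1)
band <- predict(fit_demo, newdata = data.frame(hp = hp_grid),
                interval = "confidence")
nrow(band) == length(hp_grid)
```

The same reasoning applies to the question's code: conf_int there holds a single row, while mydata$my_horse holds every observation.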

data('Auto', package='ISLR')  

fo <- acceleration ~ horsepower  ## formula object for re-use

fit <- lm(fo, Auto)

We will need a sequence over the range of the predictor horsepower, so a glance at the summary is helpful.

summary(Auto)

Then we create a sequence for plotting with a reasonable step size. These are the x-values at which the interval lines will be drawn.

n_data <- with(Auto, seq(min(horsepower), max(horsepower), by=1))

Now calculate the predictions over this sequence, setting level=.99 for the 99% intervals (predict defaults to 95%),

conf_int <- predict(fit, newdata=list(horsepower=n_data), 
                    interval='confidence', level=.99)
pred_int <- predict(fit, newdata=list(horsepower=n_data), 
                    interval='prediction', level=.99)

and plot the result.

plot(fo, data=Auto, pch=20, cex=1, col="blue", 
     xlab="car horsepower", ylab="acceleration secs to 100km/h", 
     main="Confidence intervals and prediction intervals", 
     xlim=range(Auto$horsepower))  ## hp_rg was never defined; use the predictor range
abline(fit, lwd=2, col="red")
matlines(n_data, conf_int[, 2:3], lty='dashed', col="red", lwd=2)
matlines(n_data, pred_int[, 2:3], lty='dashed', col="green", lwd=2)
legend('topright', legend=c('conf_int', 'pred_int'), col=c("red", "green"),
       lty=2, lwd=2)
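The confidence band is narrower than the prediction band because the prediction interval also accounts for the residual variance of a single new observation. As a rough check under stated assumptions, the sketch below recomputes both 99% intervals by hand with the standard simple-regression formulas and compares them with predict. It again uses the built-in mtcars data as a stand-in for Auto, and all variable names are illustrative.

```r
# Stand-in data: built-in mtcars instead of Auto (assumption for a self-contained run)
fit_demo <- lm(qsec ~ hp, data = mtcars)
x0 <- 93.5                                  # the single value from the question

n      <- nrow(mtcars)
s      <- summary(fit_demo)$sigma           # residual standard error
xbar   <- mean(mtcars$hp)
sxx    <- sum((mtcars$hp - xbar)^2)
t_crit <- qt(1 - 0.01 / 2, df = n - 2)      # 99% two-sided critical value
y0     <- sum(coef(fit_demo) * c(1, x0))    # fitted value at x0

# Standard errors: mean response vs. a single new observation (extra "1 +")
se_mean <- s * sqrt(1 / n + (x0 - xbar)^2 / sxx)
se_pred <- s * sqrt(1 + 1 / n + (x0 - xbar)^2 / sxx)

manual_conf <- c(y0 - t_crit * se_mean, y0 + t_crit * se_mean)
manual_pred <- c(y0 - t_crit * se_pred, y0 + t_crit * se_pred)

auto_conf <- predict(fit_demo, data.frame(hp = x0),
                     interval = "confidence", level = 0.99)
auto_pred <- predict(fit_demo, data.frame(hp = x0),
                     interval = "prediction", level = 0.99)

# Both comparisons should agree up to floating-point tolerance
all.equal(unname(auto_conf[1, c("lwr", "upr")]), manual_conf)
all.equal(unname(auto_pred[1, c("lwr", "upr")]), manual_pred)
```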


Note that I've used matlines here, which is more concise; you could also use lines(n_data, conf_int[, 2], ..) and lines(n_data, conf_int[, 3], ..) if you prefer.
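For reference, the lines() variant might look like the sketch below, with one call per bound and the columns selected by name instead of position. It uses the built-in mtcars data as a stand-in for Auto (an assumption so it runs without ISLR), and the names fit_demo, hp_grid and band are illustrative.

```r
# Stand-in data: built-in mtcars (assumption; the answer itself uses ISLR's Auto)
fit_demo <- lm(qsec ~ hp, data = mtcars)
hp_grid  <- seq(min(mtcars$hp), max(mtcars$hp), by = 1)
band <- predict(fit_demo, newdata = data.frame(hp = hp_grid),
                interval = "confidence", level = 0.99)

plot(qsec ~ hp, data = mtcars, pch = 20, col = "blue")
abline(fit_demo, col = "red", lwd = 2)

# One lines() call per bound; x and y both have length(hp_grid) values
lines(hp_grid, band[, "lwr"], lty = "dashed", col = "red", lwd = 2)
lines(hp_grid, band[, "upr"], lty = "dashed", col = "red", lwd = 2)
```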

jay.sf answered Jan 26 '26 17:01
