
Confidence interval of polynomial regression

I have a little issue with R and statistics.

I fitted a model by maximum likelihood, which gave me the following coefficients with their respective standard errors (among other parameter estimates):

    ParamIndex   Estimate     SE        
1         a0  0.2135187 0.02990105  
2         a1  1.1343072 0.26123775  
3         a2 -1.0000000 0.25552696  

From these I can draw my curve:

 y= 0.2135187 + 1.1343072 * x - 1 * I(x^2)

I now have to calculate the confidence interval around this curve, and I don't have a clear idea of how to do that.

Apparently, I should use propagation of error/uncertainty, but the methods I found require the raw data, or more than just the polynomial formula.

Is there any method in R to calculate the CI of my curve when only the SEs of the estimates are known?

Thank you for your help.


Edit:

So, right now, I have the covariance matrix (v), obtained with the function vcov:

                 a0           a1           a2
    a0  0.000894073 -0.003622614  0.002874075
    a1 -0.003622614  0.068245163 -0.065114661
    a2  0.002874075 -0.065114661  0.065294027

and n = 279.

asked Jan 18 '17 15:01 by trantsyx


1 Answer

You don't have enough information right now. To compute a confidence interval for your fitted curve, the complete variance-covariance matrix of your three coefficients is required, but right now you only have the standard errors, which are the square roots of the diagonal entries of that matrix.

If you had fitted an orthogonal polynomial, the variance-covariance matrix would be diagonal, with identical diagonal elements. This is certainly not your case, as:

  • standard errors you show are different from each other;
  • you have explicitly used raw polynomial notation: x + I(x ^ 2)
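
The contrast between the two bases can be sketched on made-up data. This assumes an ordinary least-squares fit with lm(), which is not necessarily how the model in the question was fitted, and the simulated x and y are purely illustrative. In this sketch the orthogonal basis gives a diagonal matrix in which the two polynomial terms share the same variance, while the raw basis gives clearly non-zero covariances.

```r
set.seed(1)                                # simulated data, illustration only
x <- runif(100)
y <- 0.2 + 1.1 * x - x^2 + rnorm(100, sd = 0.1)

V_orth <- vcov(lm(y ~ poly(x, 2)))         # orthogonal polynomial basis
V_raw  <- vcov(lm(y ~ x + I(x^2)))         # raw polynomial basis

round(V_orth, 6)   # off-diagonal entries are numerically zero
round(V_raw, 6)    # off-diagonal entries are not zero
```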

You wrote: "but the methods I found require the raw data".

What those methods need is not the "raw data" used for fitting the model. It is the "new data": the points where you want to produce the confidence band. However, you do need to know the number of observations used for fitting the model, say n, because it determines the residual degrees of freedom. In your case, with 3 coefficients, the residual degrees of freedom are n - 3.
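
To see how the degrees of freedom enter, here is a small sketch that computes the two-sided 95% critical value, taking n = 279 from the question's edit. The comparison with the normal quantile is only there for orientation; with this many observations the two should be close.

```r
n  <- 279                    # sample size reported in the question's edit
df <- n - 3                  # three polynomial coefficients
t_crit <- qt(0.975, df)      # two-sided 95% critical value from the t distribution
z_crit <- qnorm(0.975)       # large-sample (normal) counterpart
c(t = t_crit, z = z_crit)
```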

Once you have:

  • the full variance-covariance matrix, let's say V;
  • n, the number of observations used for model fitting;
  • a vector of points x at which to produce the confidence band,

you can first get the prediction standard error from:

X <- cbind(1, x, x ^ 2)    ## prediction matrix
e <- sqrt( rowSums(X * (X %*% V)) )    ## prediction standard error
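
The rowSums() expression is a shortcut for the diagonal of X V X', which avoids forming the full matrix when x is long. A minimal sketch of the equivalence, using the covariance matrix from the question's edit and three evaluation points chosen here only for illustration:

```r
V <- matrix(c( 0.000894073, -0.003622614,  0.002874075,
              -0.003622614,  0.068245163, -0.065114661,
               0.002874075, -0.065114661,  0.065294027),
            nrow = 3, byrow = TRUE)
x <- c(0, 0.5, 1)                        # illustrative evaluation points
X <- cbind(1, x, x ^ 2)                  # prediction matrix

var_fast <- rowSums(X * (X %*% V))       # row-wise quadratic forms
var_full <- diag(X %*% V %*% t(X))       # same values via the full product
all.equal(var_fast, var_full)
```

At x = 0 the prediction matrix row is (1, 0, 0), so the variance reduces to the variance of a0 alone.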

You already know how to get the predicted mean from your fitted polynomial formula. Call this mean mu; then, for a 95% CI, use

## residual degree of freedom: n - 3
mu + e * qt(0.025, n - 3)  ## lower bound
mu - e * qt(0.025, n - 3)  ## upper bound
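
The steps so far can be collected into one function. This is only a sketch: poly_ci and its arguments are hypothetical names, it assumes a raw quadratic in x, and the coefficient values and covariance matrix are the ones given in the question.

```r
# Hypothetical helper: pointwise CI for a raw quadratic, given the coefficients
# beta, their variance-covariance matrix V and the sample size n.
poly_ci <- function(x, beta, V, n, level = 0.95) {
  X  <- cbind(1, x, x ^ 2)                       # prediction matrix
  mu <- drop(X %*% beta)                         # predicted mean
  e  <- sqrt(rowSums(X * (X %*% V)))             # prediction standard error
  tq <- qt(1 - (1 - level) / 2, n - length(beta))
  data.frame(x = x, fit = mu, lwr = mu - tq * e, upr = mu + tq * e)
}

beta <- c(0.2135187, 1.1343072, -1.0000000)
V <- matrix(c( 0.000894073, -0.003622614,  0.002874075,
              -0.003622614,  0.068245163, -0.065114661,
               0.002874075, -0.065114661,  0.065294027),
            nrow = 3, byrow = TRUE)
ci <- poly_ci(seq(0, 1, by = 0.25), beta, V, n = 279)
ci
```

Returning a data frame with fit, lwr and upr columns mirrors the column names that predict.lm() uses with interval = "confidence".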

The complete theory is covered in the Q&A "How does predict.lm() compute confidence interval and prediction interval?"


Update

With the covariance matrix you have now provided, it is possible to produce some results and a figure.

V <- structure(c(0.000894073, -0.003622614, 0.002874075, -0.003622614, 
0.068245163, -0.065114661, 0.002874075, -0.065114661, 0.065294027
), .Dim = c(3L, 3L), .Dimnames = list(c("a0", "a1", "a2"), c("a0", 
"a1", "a2")))
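
A quick consistency check before using this matrix: the square roots of its diagonal should reproduce the standard errors reported in the question. The sketch below rebuilds the same matrix with matrix() so that it runs on its own.

```r
V <- matrix(c( 0.000894073, -0.003622614,  0.002874075,
              -0.003622614,  0.068245163, -0.065114661,
               0.002874075, -0.065114661,  0.065294027),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a0", "a1", "a2"), c("a0", "a1", "a2")))

se_reported <- c(a0 = 0.02990105, a1 = 0.26123775, a2 = 0.25552696)
se_from_V   <- sqrt(diag(V))             # standard errors implied by V
all.equal(se_from_V, se_reported, tolerance = 1e-5)
```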

Suppose we want to produce the CI at x = seq(-5, 5, by = 0.2):

beta <- c(0.2135187, 1.1343072, -1.0000000)
x <- seq(-5, 5, by = 0.2)
X <- cbind(1, x, x ^ 2)
mu <- X %*% beta
e <- sqrt( rowSums(X * (X %*% V)) )
n <- 279
lo <- mu + e * qt(0.025, n - 3)
up <- mu - e * qt(0.025, n - 3)
matplot(x, cbind(mu, lo, up), type = "l", col = 1, lty = c(1,2,2))

[Figure: fitted curve (solid line) with the 95% confidence band (dashed lines) over x from -5 to 5]

answered Sep 19 '22 12:09 by Zheyuan Li