
Linear regression confidence intervals in SQL

I'm using some fairly straightforward SQL code to calculate the regression coefficients (intercept and slope) of some (x, y) data points, using least squares. This gives me a nice best-fit line through the data. However, we would like to be able to see the 95% and 5% confidence intervals for the line of best fit (the curves below).

[Image: line of best fit with confidence curves] (source: curvefit.com)

What these mean is that the true line has a 95% probability of being below the upper curve and a 95% probability of being above the lower curve. How can I calculate these curves? I have already read Wikipedia etc. and done some googling, but I haven't found understandable mathematical equations for calculating them.

Edit: here is the essence of what I have right now.

--sample data
create table #lr (x real not null, y real not null)
insert into #lr values (0,1)
insert into #lr values (4,9)
insert into #lr values (2,5)
insert into #lr values (3,7)

declare @slope real
declare @intercept real

--calculate slope and intercept
select 
@slope = ((count(*) * sum(x*y)) - (sum(x)*sum(y)))/
((count(*) * sum(Power(x,2)))-Power(Sum(x),2)),
@intercept = avg(y) - ((count(*) * sum(x*y)) - (sum(x)*sum(y)))/
((count(*) * sum(Power(x,2)))-Power(Sum(x),2)) * avg(x)
from #lr

Thank you in advance.

asked Jul 23 '09 12:07 by Matt Howells




1 Answer

An equation for the confidence interval width as a function of x is given here, under "Confidence Interval on Fitted Values":

http://www.weibull.com/DOEWeb/confidence_intervals_in_simple_linear_regression.htm
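For reference, the usual textbook form of the confidence interval on a fitted value at a point x_0 is sketched below. The symbol names (s, S_xx, x_0) are my own labels rather than the linked page's notation, so check them against the page before relying on this.

```latex
\hat{y}(x_0) \;\pm\; t_{\alpha/2,\,n-2}\; s\,
\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}},
\qquad
s^2 = \frac{1}{n-2}\sum_{i=1}^{n}\bigl(y_i - \hat{y}(x_i)\bigr)^2,
\qquad
S_{xx} = \sum_{i=1}^{n}(x_i - \bar{x})^2
```

Here \hat{y}(x) = intercept + slope * x is the fitted line and n is the number of data points. The term (x_0 - \bar{x})^2 is what makes the limits curve away from the line as x_0 moves away from the mean of x.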

The page walks you through an example calculation too.
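A minimal T-SQL sketch of that calculation might look like the following. It is a standalone batch, not the asker's exact code: the table variable @pts mirrors the question's #lr, the variable and column names are made up for illustration, and it assumes more than two data points. As far as I know T-SQL has no built-in inverse t-distribution function, so the critical value @t is an assumption you look up in a t table yourself; 2.920 is the one-sided 95% value for n - 2 = 2 degrees of freedom (use 4.303 for a two-sided 95% interval).

```sql
-- Sketch: confidence limits for the fitted line at each observed x.
-- @pts mirrors the question's #lr; all names here are illustrative.
declare @pts table (x float not null, y float not null)
insert into @pts values (0, 1)
insert into @pts values (4, 9)
insert into @pts values (2, 5)
insert into @pts values (3, 7)

declare @n float, @xbar float, @ybar float, @sxx float, @sxy float
declare @slope float, @intercept float, @s float, @t float

-- number of points and the means of x and y
select @n = count(*), @xbar = avg(x), @ybar = avg(y) from @pts

-- centered sums: Sxx = sum((x - xbar)^2), Sxy = sum((x - xbar)(y - ybar))
select @sxx = sum((x - @xbar) * (x - @xbar)),
       @sxy = sum((x - @xbar) * (y - @ybar))
from @pts

-- least-squares coefficients
set @slope = @sxy / @sxx
set @intercept = @ybar - @slope * @xbar

-- residual standard error: s = sqrt(SSE / (n - 2)), requires n > 2
select @s = sqrt(sum(power(y - (@intercept + @slope * x), 2)) / (@n - 2))
from @pts

-- ASSUMPTION: t critical value supplied by hand from a t table
-- (one-sided 95%, 2 degrees of freedom)
set @t = 2.920

-- fitted value and lower/upper limits at each x
select x,
       @intercept + @slope * x as fitted,
       @intercept + @slope * x
           - @t * @s * sqrt(1.0 / @n + power(x - @xbar, 2) / @sxx) as lower_limit,
       @intercept + @slope * x
           + @t * @s * sqrt(1.0 / @n + power(x - @xbar, 2) / @sxx) as upper_limit
from @pts
order by x
```

Note that the four sample points from the question lie exactly on y = 2x + 1, so the residuals are all zero and the limits collapse onto the fitted line. With real, noisy data the limits separate, and they are narrowest at the mean of x. To draw smooth curves, evaluate the same expression over a table of evenly spaced x values instead of the observed points.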

answered Oct 07 '22 10:10 by nsanders
