SLR - simple linear regression (in R, but about the math behind it, not the programming)

I have some problems understanding simple linear regression. I have read a lot, so I have the basic ideas in mind, but I cannot quite follow when we actually do one. I have this equation:

yi = a + bxi + ei

Okay so I do realize this is the equation for a straight line, even though I do wonder about the "ei" as I cannot find it on the internet, but my professor keeps using it.

So I want to find a and b, so I can find a straight line which I hope isn't too far away from my data (is that right?). I know I can calculate that, but this is not my question.

I hope it is alright if I add my example here, so I can explain what I'm doing. This is my data set:

x        y
8        6.4
8        6.8
3        1.7
2        2.3
2        3.8
1        2.3
1        5.0
1        4.0
1        3.4
0        2.3

Calculating everything that's needed, I get: b = 0.4599, a = 2.55827
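For reference, these values can be reproduced from the closed-form least-squares formulas. The following is only a minimal sketch in R, assuming the ten (x, y) pairs from the table above; the names `sxy`, `sxx`, `a_hat` and `b_hat` are illustrative and not from the original post.

```r
# Data from the table above
x <- c(8, 8, 3, 2, 2, 1, 1, 1, 1, 0)
y <- c(6.4, 6.8, 1.7, 2.3, 3.8, 2.3, 5.0, 4.0, 3.4, 2.3)

# Closed-form least-squares estimates (illustrative variable names)
sxy <- sum((x - mean(x)) * (y - mean(y)))  # co-variation of x and y
sxx <- sum((x - mean(x))^2)                # variation of x around its mean
b_hat <- sxy / sxx                         # slope
a_hat <- mean(y) - b_hat * mean(x)         # intercept

# Compare with R's built-in fit; coef() lists the intercept first, then the slope
fit <- lm(y ~ x)
print(c(intercept = a_hat, slope = b_hat))
print(coef(fit))
```

The slope comes out at about 0.4599. The intercept computed from the unrounded slope is about 2.5582, so a small difference in the last digit of a is what you would expect if b was rounded to 0.4599 before computing a.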

(and doing the lm with R shows me it is right). Now if I draw this straight line with abline(2.55827, 0.4599) (entering the intercept first??), it shows me that this is just not a good line, and looking at the table I would totally agree. But do I understand this right? If the (x, y) points are scattered the way these are (meaning without a specific pattern), there's just no good line to find, so I can only find a reasonably good one.

Can someone maybe help me out here?

lisa asked Nov 22 '25 03:11


1 Answer

Okay so I do realize this is the equation for a straight line, even though I do wonder about the "ei" as I cannot find it on the internet, but my professor keeps using it.

It's not the equation for a line. yi = a + bxi is the equation for a line. That ei is the error between this straight line given by a and b and your measurements. In other words, ei = yi - (a + bxi).
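To make the definition ei = yi - (a + bxi) concrete, here is a small sketch in R that computes the error term for each point in the question's data set, using the a and b quoted in the question. The names `fitted_y` and `e` are illustrative assumptions, not anything from the original posts.

```r
# Data and coefficients as quoted in the question
x <- c(8, 8, 3, 2, 2, 1, 1, 1, 1, 0)
y <- c(6.4, 6.8, 1.7, 2.3, 3.8, 2.3, 5.0, 4.0, 3.4, 2.3)
a <- 2.55827  # intercept
b <- 0.4599   # slope

# Value of the straight line at each x, and the error of each measurement
fitted_y <- a + b * x
e <- y - fitted_y  # ei = yi - (a + b * xi)

# One row per data point: positive e means the point lies above the line
print(data.frame(x, y, fitted = round(fitted_y, 3), e = round(e, 3)))
```

Each row shows how far one measurement sits above or below the line. For a least-squares line with an intercept, these errors sum to (approximately) zero, so the line passes through the "middle" of the data even when individual points are far from it.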

What linear regression does is find the values of a and b that minimize the sum of the squares of those error terms. This fit is not necessarily a good one; it's just the best possible (in a least squares sense). The size of the residuals gives you an idea of how good the fit was.
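A rough way to see "best possible in a least squares sense" is to compare the sum of squared errors of the fitted line with that of slightly different lines. This is only a sketch in R on the question's data; the helper `rss` and the particular shift (0.5) and tilt (0.1) are arbitrary choices made for illustration.

```r
# Data from the question
x <- c(8, 8, 3, 2, 2, 1, 1, 1, 1, 0)
y <- c(6.4, 6.8, 1.7, 2.3, 3.8, 2.3, 5.0, 4.0, 3.4, 2.3)

# Sum of squared errors for an arbitrary line y = a + b * x (hypothetical helper)
rss <- function(a, b) sum((y - (a + b * x))^2)

# Least-squares estimates from lm(); coef() gives intercept, then slope
fit <- lm(y ~ x)
a_hat <- unname(coef(fit)[1])
b_hat <- unname(coef(fit)[2])

rss_best  <- rss(a_hat, b_hat)        # the fitted line
rss_shift <- rss(a_hat + 0.5, b_hat)  # same slope, line moved up by 0.5
rss_tilt  <- rss(a_hat, b_hat + 0.1)  # same intercept, steeper slope

print(c(best = rss_best, shifted = rss_shift, tilted = rss_tilt))
```

Any other line gives a larger sum of squares than the fitted one, but the minimum itself is still well above zero for this data, which is the sense in which the line is "best" without being "good".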

To be able to make sense of whether the fit is good or bad, you need to know not just the residuals but also the errors in the individual measurements.

David Hammen answered Nov 24 '25 15:11


