I wonder if there is a way to calculate the distance between an abline in a plot and a data point? For example, what is the distance between the point with concentration == 40 and signal == 643 (element 5) and the abline?
concentration <- c(1,10,20,30,40,50)
signal <- c(4, 22, 44, 244, 643, 1102)
plot(concentration, signal)
res <- lm(signal ~ concentration)
abline(res)
The standard error of estimate (SEE) provides a measure of how accurately the regression equation predicts the Y values. For example, SEE of 2.16 would tell us that the standard or average distance between the actual data points and the regression line is 2.16 units.
The value of a residual tells us the vertical distance between a data point and the regression line. The general form of a linear equation is y = a + bx, where a represents the y-intercept.
You are basically asking for the residuals.
R> residuals(res)
1 2 3 4 5 6
192.61 12.57 -185.48 -205.52 -26.57 212.39
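To connect the residual to the plot, here is a minimal sketch that recomputes the fifth residual by hand as observed minus fitted. The variable names fitted_5, resid_5 and vertical_distance_5 are illustrative only and do not come from the original answer.

```r
# Same data and model as in the question
concentration <- c(1, 10, 20, 30, 40, 50)
signal <- c(4, 22, 44, 244, 643, 1102)
res <- lm(signal ~ concentration)

# Height of the fitted line at concentration == 40
fitted_5 <- predict(res, newdata = data.frame(concentration = 40))

# Residual = observed - fitted; a negative sign means the point lies below the line
resid_5 <- signal[5] - unname(fitted_5)

# The vertical distance to the abline is the absolute value of the residual
vertical_distance_5 <- abs(resid_5)
```

Here resid_5 should agree with the fifth entry of residuals(res) shown above.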
As an aside, when you fit a linear regression, the sum of the residuals is 0:
R> sum(residuals(res))
[1] 8.882e-15
and, if the model is correct, they should follow a Normal distribution. You can check this with qqnorm(residuals(res)).
I find working with the standardised residuals easier.
> rstandard(res)
1 2 3 4 5 6
1.37707 0.07527 -1.02653 -1.13610 -0.15845 1.54918
These residuals have been scaled to have mean zero and variance (approximately) equal to one, and to have a Normal distribution. Outlying standardised residuals are those larger than 2 in absolute value.
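As a rough illustration of the scaling, the sketch below rebuilds the standardised residuals from the raw residuals, the residual standard error and the leverages, then applies the +/- 2 rule of thumb. The names sigma_hat, h, std_by_hand and outliers are made up for this example.

```r
# Same data and model as in the question
concentration <- c(1, 10, 20, 30, 40, 50)
signal <- c(4, 22, 44, 244, 643, 1102)
res <- lm(signal ~ concentration)

# Residual standard error and leverage (hat value) of each point
sigma_hat <- summary(res)$sigma
h <- hatvalues(res)

# Standardised residual = raw residual / (sigma_hat * sqrt(1 - leverage))
std_by_hand <- residuals(res) / (sigma_hat * sqrt(1 - h))

# Indices of points flagged by the +/- 2 rule of thumb
outliers <- which(abs(rstandard(res)) > 2)
```

For this data set no point exceeds the threshold, so outliers comes back empty.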
You can use the function below:
http://paulbourke.net/geometry/pointlineplane/pointline.r
Then just extract the slope and intercept:
> coef(res)
(Intercept) concentration
-210.61098 22.00441
So your final answer would be:
concentration <- c(1,10,20,30,40,50)
signal <- c(4, 22, 44, 244, 643, 1102)
plot(concentration, signal)
res <- lm(signal ~ concentration)
abline(res)
cfs <- coef(res)
distancePointLine(y=signal[5], x=concentration[5], slope=cfs[2], intercept=cfs[1])
If you want a more general solution for finding a particular point, concentration == 40 returns a Boolean vector of length length(concentration). You can use that vector to select points.
pt.sel <- ( concentration == 40 )
> pt.sel
[1] FALSE FALSE FALSE FALSE TRUE FALSE
> distancePointLine(y=signal[pt.sel], x=concentration[pt.sel], slope=cfs["concentration"], intercept=cfs["(Intercept)"])
1.206032
Unfortunately, distancePointLine doesn't appear to be vectorized (or, if it is, it returns a warning when you pass it a vector). Otherwise you could get answers for all points just by leaving the [] selector off the x and y arguments.
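If a vectorized version is wanted, one possible workaround is a small helper based on the standard point-to-line formula: for a line y = a + b*x, the perpendicular distance from (x, y) is |b*x - y + a| / sqrt(b^2 + 1). The helper perp_distance below is a hypothetical stand-in written for this example, not the distancePointLine function from the linked file.

```r
# Same data and model as in the question
concentration <- c(1, 10, 20, 30, 40, 50)
signal <- c(4, 22, 44, 244, 643, 1102)
res <- lm(signal ~ concentration)
cfs <- coef(res)

# Hypothetical helper: perpendicular distance from points (x, y)
# to the line y = intercept + slope * x; works element-wise on vectors
perp_distance <- function(x, y, slope, intercept) {
  abs(slope * x - y + intercept) / sqrt(slope^2 + 1)
}

# Distances for all six points at once
d_all <- perp_distance(concentration, signal,
                       slope = cfs[["concentration"]],
                       intercept = cfs[["(Intercept)"]])

# Distance for the selected point only
d_sel <- d_all[concentration == 40]
```

The value of d_sel should agree with the 1.206032 reported above. Note that this distance mixes the units of the two axes, whereas the residual is the purely vertical distance.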