
Calculating the number of dots that lie above and below the regression line with R [closed]

How do I calculate the number of dots that lie above and below the regression line on a scatter plot?

data = read.csv("info.csv")
par(pty = "s")
plot(data$col1, data$col2, xlab = "xaxis", ylab = "yaxis", xlim = c(0, 
  1), cex.lab = 1.5, cex.axis = 1.5, ylim = c(0, 1), col.lab = "red", 
  col = "blue", pch = 19)
abline(a = -1.21, b = 2.21)
user1731629 asked Oct 09 '12 11:10


2 Answers

x <- 1:10
set.seed(1)
y <- 2*x+rnorm(10)

plot(y~x)

fit <- lm(y~x)
abline(fit)

resi <- resid(fit)
#below the fit:
sum(resi < 0)
#above the fit:
sum(resi > 0)

Edit: If you did (for some unknown reason) something like this:

x <- 1:10
set.seed(1)
y <- 2*x+rnorm(10)

plot(y~x)
abline(-0.17,2.05)

You can do this:

yfit <- 2.05 * x - 0.17
resi <- y - yfit

sum(resi < 0)
sum(resi > 0)
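
Applied to the setup in the question, the same idea might look like the sketch below. The data frame is made up for illustration, since the contents of info.csv are not shown; only the column names col1 and col2 and the coefficients passed to abline(a = -1.21, b = 2.21) come from the question.

```r
# Hypothetical stand-in for read.csv("info.csv"); these values are invented
data <- data.frame(
  col1 = c(0.60, 0.70, 0.80, 0.90, 1.00),
  col2 = c(0.30, 0.20, 0.70, 0.60, 0.90)
)

# Intercept (a) and slope (b) as passed to abline() in the question
a <- -1.21
b <- 2.21

# Vertical distance of each point from the line y = a + b * x
resi <- data$col2 - (a + b * data$col1)

n_above <- sum(resi > 0)  # points above the line
n_below <- sum(resi < 0)  # points below the line
```

Points that fall exactly on the line are counted in neither total.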
Roland answered Sep 19 '22 13:09


If I've read the question properly, the answer would be:

  1. Determine the equation of the regression line. It is straight, so it will be of the form y = mx + b, where m is the slope of the line and b is the y-intercept.
  2. Calculate the y value on the line for each x in your data.
  3. Compare the value of y that you have in your data with the calculated value of y, and determine whether it is greater than, equal to, or less than it.

The steps above should be sufficient to find the counts you are after.
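
A minimal sketch of those three steps in R, assuming an illustrative line and invented data (the slope, intercept, x, y and the tolerance tol are all assumptions, not values from the question):

```r
# Step 1: the line y = m * x + b (illustrative coefficients)
m <- 2
b <- 1

# Invented example data
x <- c(1, 2, 3, 4, 5)
y <- c(3.5, 5, 6, 9, 12)

# Step 2: y value on the line for each x
y_line <- m * x + b

# Step 3: compare each observed y with the value on the line.
# A small tolerance treats near-zero differences as "on the line".
diffs <- y - y_line
tol <- 1e-9
position <- ifelse(abs(diffs) <= tol, "on",
                   ifelse(diffs > 0, "above", "below"))

# Count the points in each group; factor() keeps empty groups as 0
counts <- table(factor(position, levels = c("above", "on", "below")))
print(counts)
```

Keeping a separate "on" group is a design choice; if exact ties do not matter for your data, sum(diffs > 0) and sum(diffs < 0) are enough.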

Big Bream answered Sep 19 '22 13:09