
Calculating R^2 for a nonlinear least squares fit

Tags:

r

Suppose I have x values, y values, and expected y values f (from some nonlinear best fit curve).

How can I compute R^2 in R? Note that this function is not a linear model, but a nonlinear least squares (nls) fit, so not an lm fit.

CodeGuy asked Jan 25 '13 21:01




2 Answers

You just use the lm function to fit a linear model:

x <- runif(100)
y <- runif(100)
spam <- summary(lm(x ~ y))
spam$r.squared
# [1] 0.0008532386 (varies from run to run, since no seed is set)

Note that R-squared is not well defined for nonlinear models, or is at least very tricky to interpret. To quote from R-help:

There is a good reason that an nls model fit in R does not provide r-squared - r-squared doesn't make sense for a general nls model.

One way of thinking of r-squared is as a comparison of the residual sum of squares for the fitted model to the residual sum of squares for a trivial model that consists of a constant only. You cannot guarantee that this is a comparison of nested models when dealing with an nls model. If the models aren't nested this comparison is not terribly meaningful.

So the answer is that you probably don't want to do this in the first place.

If you want peer-reviewed evidence, see this article for example; it's not that you can't compute the R^2 value, it's just that it may not mean the same thing/have the same desirable properties as in the linear-model case.
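To make the nesting point concrete, here is a minimal sketch on simulated data. The model y ~ exp(-b * x), the seed, and all variable names are assumptions chosen for illustration, not taken from the question. Because this model has no parameter setting that yields an arbitrary constant, the constant-only model is not nested in it, and the usual decomposition of the total sum of squares is not guaranteed to hold.

```r
# Simulated data from a one-parameter exponential decay (illustrative only)
set.seed(1)
x <- seq(0, 5, length.out = 50)
y <- exp(-0.8 * x) + rnorm(50, sd = 0.05)

# Nonlinear least squares fit; there is no intercept-like parameter,
# so a constant-only model cannot be obtained as a special case
fit <- nls(y ~ exp(-b * x), start = list(b = 1))
f <- fitted(fit)

ss_res <- sum((y - f)^2)        # residual sum of squares of the nls fit
ss_tot <- sum((y - mean(y))^2)  # residual sum of squares of the constant-only model
ss_reg <- sum((f - mean(y))^2)  # "explained" sum of squares

# For lm with an intercept, ss_tot equals ss_reg + ss_res exactly.
# For an nls fit like this one, the two sides generally differ.
print(c(ss_tot = ss_tot, ss_reg_plus_ss_res = ss_reg + ss_res))
```

You can still compute 1 - ss_res / ss_tot from these quantities, but it no longer reads as a "fraction of variance explained" in the strict sense it has for a linear model.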

Paul Hiemstra answered Sep 28 '22 14:09


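It sounds like f holds your predicted values. In that case, take the sum of squared distances from them to the actual values and divide it by the total sum of squares of y, which is (n - 1) times the variance of y, because var() in R divides by n - 1.

So something like

1 - sum((y - f)^2) / ((length(y) - 1) * var(y))

should give you a quasi R-squared value, so long as your model is reasonably close to a linear model and n is pretty big.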
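As a sketch of how this could be wrapped up and applied to an nls fit: the helper name pseudo_r2, the Michaelis-Menten-style model, and the simulated data below are all hypothetical, chosen only for illustration. The total sum of squares is computed directly as sum((y - mean(y))^2), which equals (length(y) - 1) * var(y), since var() in R divides by n - 1.

```r
# Hypothetical helper: pseudo R-squared from observed values y and fitted values f
pseudo_r2 <- function(y, f) {
  stopifnot(length(y) == length(f))
  ss_res <- sum((y - f)^2)        # residual sum of squares
  ss_tot <- sum((y - mean(y))^2)  # total sum of squares around the mean of y
  1 - ss_res / ss_tot
}

# Example: simulated saturating curve fitted with nls (illustrative only)
set.seed(42)
x <- seq(1, 10, length.out = 40)
y <- 5 * x / (2 + x) + rnorm(40, sd = 0.2)
fit <- nls(y ~ a * x / (b + x), start = list(a = 4, b = 1))

r2 <- pseudo_r2(y, fitted(fit))
print(r2)
```

Unlike the linear-model R-squared, this quantity is not bounded below by 0: a fit that does worse than simply predicting mean(y) gives a negative value.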

Seth answered Sep 28 '22 13:09