Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Negative exponential fit: curve looks too high

I am trying to fit a negative exponential to some data in R, but the fitted line looks too high compared to the data, whereas the fit I get using Excel's built-in power fit looks more believable. Can someone tell me why? I've tried using the nls() function and also optim() and get similar parameters from both of those methods, but the fits for both look high.

   x    <- c(5.96, 12.86, 8.40, 2.03, 12.84, 21.44, 21.45, 19.97, 8.92, 25.00, 19.90, 20.00, 20.70, 16.68, 14.90, 26.00, 22.00, 22.00, 10.00, 5.70, 5.40, 3.20, 7.60, 0.59, 0.14, 0.85, 9.20, 0.79, 1.40, 2.68, 1.91)
   y    <- c(5.35, 2.38, 1.77, 1.87, 1.47, 3.27, 2.01, 0.52, 2.72, 0.85, 1.60, 1.37, 1.48, 0.39, 2.39, 1.83, 0.71, 1.24, 3.14, 2.16, 2.22, 11.50, 8.32, 38.98, 16.78, 32.66, 3.89, 1.89, 8.71, 9.74, 23.14)

    xy.frame <- data.frame(x,y)

    nl.fit <- nls(formula=(y ~ a * x^b), data=xy.frame, start = c(a=10, b=-0.7))

    a.est <- coef(nl.fit)[1]
    b.est <- coef(nl.fit)[2]

    plot(x=xy.frame$x,y=xy.frame$y)

    # curve looks too high
    curve(a.est * x^b.est , add=T)
    # these parameters from Excel seem to fit better
    curve(10.495 * x^-0.655, add=T)

enter image description here

    # alternatively use optim()
    theta.init <- c(1000,-0.5, 50)

    exp.nll <- function(theta, data){
      a <- theta[1]
      b <- theta[2]
      sigma <- theta[3]
      obs.y <- data$y
      x <- data$x
      pred.y <- a*x^b
      nll <- -sum(dnorm(x=obs.y, mean=pred.y , sd=sigma, log=T))
      nll
    }

    fit.optim <- optim(par=theta.init,fn=exp.nll,method="BFGS",data=xy.frame )

    plot(x=xy.frame$x,y=xy.frame$y)

    # still looks too high
    curve(a.est * x^b.est, add=T)

enter image description here

like image 250
ginko Avatar asked Oct 20 '15 21:10

ginko


People also ask

What is a negative exponential curve?

The negative exponential function is y=e−x y = e − x . As the the value for x becomes larger, e−x approaches zero. We have exponential decay when dydx∝−y d y d x ∝ − y .

How many points do you need to fit an exponential curve?

The rule of thumb in applied sciences and engineering is you need a minimum 3 points over 3 orders of magnitude for a curve fit. 5 over 5 is better. This is what we use in reliability physics, which involves fitting exponential curves.

What is exponential regression?

What is Exponential Regression? It is a model that explains processes that experiences growth at a double rate. It is used for situations where the growth begins slowly but rapidly speeds up without bounds or where the decay starts rapidly but slows down to get to zero.


1 Answers

The reason you're seeing the unexpected behavior is that the curves that look "too high" actually have much lower sums of squared errors than the curves from excel:

# Fit from nls
sum((y - a.est*x^b.est)^2) 
# [1] 1588.313

# Fit from excel
sum((y - 10.495*x^ -0.655)^2)
# [1] 1981.561

The reason nls favors the higher curve is that it is working to avoid huge errors at small x values at the cost of slightly larger errors with large x values. One way to address this might be to apply a log-log transformation:

mod <- lm(log(y)~log(x))
(a.est2 <- exp(coef(mod)["(Intercept)"]))
# (Intercept) 
#    10.45614 
(b.est2 <- coef(mod)["log(x)"])
#     log(x) 
# -0.6529741 

These are quite close to the coefficients from excel, and yield a more visually appealing fit (despite the worse performance on the sum-of-squared-errors metric):

enter image description here

like image 195
josliber Avatar answered Sep 30 '22 04:09

josliber