I've tried searching the other threads on this topic but none of the fixes are working for me. I have the results of a natural experiment and I want to show the number of consecutive occurrences of an event fit an exponential distribution. My R shell is pasted below
f <- function(x,a,b) {a * exp(b * x)}
> x
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26 27
> y
[1] 1880 813 376 161 100 61 31 9 8 2 7 4 3 2 0
[16] 1 0 0 0 0 0 1 0 0 0 0 1
> dat2
x y
1 1 1880
2 2 813
3 3 376
4 4 161
5 5 100
6 6 61
7 7 31
8 8 9
9 9 8
10 10 2
11 11 7
12 12 4
13 13 3
14 14 2
> fm <- nls(y ~ f(x,a,b), data = dat2, start = c(a=1, b=1))
Error in numericDeriv(form[[3L]], names(ind), env) :
Missing value or an infinity produced when evaluating the model
> fm <- nls(y ~ f(x,a,b), data = dat2, start = c(a=7, b=-.5))
Error in nls(y ~ f(x, a, b), data = dat2, start = c(a = 7, b = -0.5)) :
singular gradient
> fm <- nls(y ~ f(x,a,b), data = dat2, start = c(a=7,b=-.5),control=nls.control(maxiter=1000,warnOnly=TRUE,minFactor=1e-5,tol=1e-10),trace=TRUE)
4355798 : 7.0 -0.5
Warning message:
In nls(y ~ f(x, a, b), data = dat2, start = c(a = 7, b = -0.5), :
singular gradient
Please forgive the bad formatting, first post here. x contains bins of a histogram, y contains the number of occurrences of each bin in that histograms. dat2 cuts off at 14 since the 0 count bins would throw off the exponential regression, and I really only need to fit those first 14. Those bins which have counts beyond 14 I have biological reason to believe they are special. The issue I initially got was infinity, which I don't get since none of the values are 0. After giving decent starting values as suggested by a different post here I get the singular gradient error. The only other posts I saw with that had more variables, I tried increasing the number of iterations but that did not succeed. Any help is appreciated. A
When it fails to have full column rank the "singular gradient" message is given and the iterations stop. Generally this indicates that the model is overparameterized or that the starting estimates were poorly chosen. Try using trace = TRUE in the call to nls and watching the progress of the iterations.
The most basic way to estimate such parameters is to use a non-linear least squares approach (function nls in R) which basically approximate the non-linear function using a linear one and iteratively try to find the best parameter values (wiki).
1) linearize to get starting values You need better starting values:
# starting values
fm0 <- nls(log(y) ~ log(f(x, a, b)), dat2, start = c(a = 1, b = 1))
nls(y ~ f(x, a, b), dat2, start = coef(fm0))
giving:
Nonlinear regression model
model: y ~ f(x, a, b)
data: x
a b
4214.4228 -0.8106
residual sum-of-squares: 2388
Number of iterations to convergence: 6
Achieved convergence tolerance: 3.363e-06
1a) Similarly we could use lm
to get the initial value by writing
y ~ a * exp(b * x)
as
y ~ exp(log(a) + b * x)
and taking logs of both to get a model linear in log(a) and b:
log(y) ~ log(a) + b * x
which can be solved using lm
:
fm_lm <- lm(log(y) ~ x, dat2)
st <- list(a = exp(coef(fm_lm)[1]), b = coef(fm_lm)[2])
nls(y ~ f(x, a, b), dat2, start = st)
giving:
Nonlinear regression model
model: y ~ f(x, a, b)
data: dat2
a b
4214.423 -0.811
residual sum-of-squares: 2388
Number of iterations to convergence: 6
Achieved convergence tolerance: 3.36e-06
1b) We can also get it to work by reparameterizing. In that case a = 1 and b = 1 will work provided we transform the initial values in line with the parameter transformation.
nls(y ~ exp(loga + b * x), dat2, start = list(loga = log(1), b = 1))
giving:
Nonlinear regression model
model: y ~ exp(loga + b * x)
data: dat2
loga b
8.346 -0.811
residual sum-of-squares: 2388
Number of iterations to convergence: 20
Achieved convergence tolerance: 3.82e-07
so b is as shown and a = exp(loga) = exp(8.346) = 4213.3
2) plinear Another possibility that is even easier is to use alg="plinear"
in which case starting values are not needed for the parameters entering linearly. In that case the starting value of b=1
in the question seems sufficient.
nls(y ~ exp(b * x), dat2, start = c(b = 1), alg = "plinear")
giving:
Nonlinear regression model
model: y ~ exp(b * x)
data: dat2
b .lin
-0.8106 4214.4234
residual sum-of-squares: 2388
Number of iterations to convergence: 11
Achieved convergence tolerance: 2.153e-06
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With