Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R nls singular gradient

Tags:

r

regression

nls

I've tried searching the other threads on this topic but none of the fixes are working for me. I have the results of a natural experiment and I want to show the number of consecutive occurrences of an event fit an exponential distribution. My R shell is pasted below

f <- function(x,a,b) {a * exp(b * x)}
> x
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26 27
> y
 [1] 1880  813  376  161  100   61   31    9    8    2    7    4    3    2    0
[16]    1    0    0    0    0    0    1    0    0    0    0    1
> dat2
    x    y
1   1 1880
2   2  813
3   3  376
4   4  161
5   5  100
6   6   61
7   7   31
8   8    9
9   9    8
10 10    2
11 11    7
12 12    4
13 13    3
14 14    2
> fm <- nls(y ~ f(x,a,b), data = dat2, start = c(a=1, b=1)) 
Error in numericDeriv(form[[3L]], names(ind), env) : 
  Missing value or an infinity produced when evaluating the model
> fm <- nls(y ~ f(x,a,b), data = dat2, start = c(a=7, b=-.5)) 
Error in nls(y ~ f(x, a, b), data = dat2, start = c(a = 7, b = -0.5)) : 
  singular gradient
> fm <- nls(y ~ f(x,a,b), data = dat2, start = c(a=7,b=-.5),control=nls.control(maxiter=1000,warnOnly=TRUE,minFactor=1e-5,tol=1e-10),trace=TRUE) 
4355798 :   7.0 -0.5
Warning message:
In nls(y ~ f(x, a, b), data = dat2, start = c(a = 7, b = -0.5),  :
  singular gradient

Please forgive the bad formatting, first post here. x contains bins of a histogram, y contains the number of occurrences of each bin in that histograms. dat2 cuts off at 14 since the 0 count bins would throw off the exponential regression, and I really only need to fit those first 14. Those bins which have counts beyond 14 I have biological reason to believe they are special. The issue I initially got was infinity, which I don't get since none of the values are 0. After giving decent starting values as suggested by a different post here I get the singular gradient error. The only other posts I saw with that had more variables, I tried increasing the number of iterations but that did not succeed. Any help is appreciated. A

like image 376
sessmurda Avatar asked Aug 21 '13 17:08

sessmurda


People also ask

What does singular gradient mean in R?

When it fails to have full column rank the "singular gradient" message is given and the iterations stop. Generally this indicates that the model is overparameterized or that the starting estimates were poorly chosen. Try using trace = TRUE in the call to nls and watching the progress of the iterations.

What is NLS R?

The most basic way to estimate such parameters is to use a non-linear least squares approach (function nls in R) which basically approximate the non-linear function using a linear one and iteratively try to find the best parameter values (wiki).


1 Answers

1) linearize to get starting values You need better starting values:

# starting values
fm0 <- nls(log(y) ~ log(f(x, a, b)), dat2, start = c(a = 1, b = 1))

nls(y ~ f(x, a, b), dat2, start = coef(fm0))

giving:

Nonlinear regression model
  model: y ~ f(x, a, b)
   data: x
        a         b 
4214.4228   -0.8106 
 residual sum-of-squares: 2388

Number of iterations to convergence: 6 
Achieved convergence tolerance: 3.363e-06

1a) Similarly we could use lm to get the initial value by writing

y ~ a * exp(b * x)

as

y ~ exp(log(a) + b * x)

and taking logs of both to get a model linear in log(a) and b:

log(y) ~ log(a) + b * x

which can be solved using lm:

fm_lm <- lm(log(y) ~ x, dat2)
st <- list(a = exp(coef(fm_lm)[1]), b = coef(fm_lm)[2])
nls(y ~ f(x, a, b), dat2, start = st)

giving:

Nonlinear regression model
  model: y ~ f(x, a, b)
   data: dat2
       a        b 
4214.423   -0.811 
 residual sum-of-squares: 2388

Number of iterations to convergence: 6 
Achieved convergence tolerance: 3.36e-06

1b) We can also get it to work by reparameterizing. In that case a = 1 and b = 1 will work provided we transform the initial values in line with the parameter transformation.

nls(y ~ exp(loga + b * x), dat2, start = list(loga = log(1), b = 1))

giving:

Nonlinear regression model
  model: y ~ exp(loga + b * x)
   data: dat2
  loga      b 
 8.346 -0.811 
 residual sum-of-squares: 2388

Number of iterations to convergence: 20 
Achieved convergence tolerance: 3.82e-07

so b is as shown and a = exp(loga) = exp(8.346) = 4213.3

2) plinear Another possibility that is even easier is to use alg="plinear" in which case starting values are not needed for the parameters entering linearly. In that case the starting value of b=1 in the question seems sufficient.

nls(y ~ exp(b * x), dat2, start = c(b = 1), alg = "plinear")

giving:

Nonlinear regression model
  model: y ~ exp(b * x)
   data: dat2
        b      .lin 
  -0.8106 4214.4234 
 residual sum-of-squares: 2388

Number of iterations to convergence: 11 
Achieved convergence tolerance: 2.153e-06
like image 110
G. Grothendieck Avatar answered Oct 04 '22 09:10

G. Grothendieck