Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using R to fit a Sigmoidal Curve

I have read a post ( Sigmoidal Curve Fit in R ). It was labeled duplicated, but I can't see anything related with the posts. And the answer given for the posts was not enough.

I read a webpage

Similar to the others, he uses this format to fit the line:

fitmodel <- nls(y~a/(1 + exp(-b * (x-c))), start=list(a=1,b=.5,c=25))

The problem is that, a,b,c were given in most of the cases and I have no clues which set of a,b,c should I use for my set of data. Could someone give me some advice on how to obtain the parameters?

Here is my set of numbers :

x <- c(3.9637878,3.486667,3.0095444,2.5324231,2.0553019,1.5781806,1.1010594,0.6242821)
y <- c(6491.314,6190.092,2664.021,2686.414,724.707,791.243,1809.586,541.243)
like image 882
FunnyFunkyBuggy Avatar asked Oct 09 '15 08:10

FunnyFunkyBuggy


3 Answers

Luckily R offers a selfstarting model for the logistic model. It uses a slight reparametrization, but is really the same model as yours: Asym/(1+exp((xmid-input)/scal))

A selfstarting model can estimate good starting values for you, so you don't have to specify them.

plot(y ~ x)
fit <- nls(y ~ SSlogis(x, Asym, xmid, scal), data = data.frame(x, y))

summary(fit)
#Formula: y ~ SSlogis(x, Asym, xmid, scal)
#
#Parameters:
#      Estimate Std. Error t value Pr(>|t|)
#Asym 1.473e+04  2.309e+04   0.638    0.551
#xmid 4.094e+00  2.739e+00   1.495    0.195
#scal 9.487e-01  5.851e-01   1.622    0.166
#
#Residual standard error: 941.9 on 5 degrees of freedom
#
#Number of iterations to convergence: 0 
#Achieved convergence tolerance: 4.928e-06

lines(seq(0.5, 4, length.out = 100), 
      predict(fit, newdata = data.frame(x = seq(0.5, 4, length.out = 100))))

resulting plot

Of course your data doesn't really support the model. The estimated mid point is just at the right limit of your data range and thus the parameter estimates (in particular for the asymptote) are very uncertain.

like image 103
Roland Avatar answered Dec 07 '22 22:12

Roland


The code I used to fit your data:

 df <- data.frame(x=c(3.9637878,3.486667,3.0095444,2.5324231,2.0553019,1.5781806,1.1010594,0.6242821),                     
                  y=c(6491.314,6190.092,2664.021,2686.414,724.707,791.243,1809.586,541.243))

library(drc)
fm <- drm(y ~ x, data = df, fct = G.3())

plot(fm)
summary(fm)

The way it looks after fitting: enter image description here

like image 23
numb Avatar answered Dec 07 '22 22:12

numb


I see two issues.

  1. the default algorithm of nls is very sensitive to the starting parameter. In your example data I found it useful to use algorithm='port'. Alternatively switching to a "robust" implementation might also help.

  2. It helps understanding the role of the parameter in your model.

The simple interpretation for your model is: The sigmoid goes in y from 0 to a. It reaches the "half way" point at x=c. b has the role of a slope, and if negative the model would go from a to 0 instead.

Specifically to the test data posted by you I would estimate the start values as following:

  • First thing I notice - your data is not exactly 'close' to zero so maybe it might be useful to add an offset d which is around 1000.
  • a is then 5000 or greater
  • c is somewhere greater 2 - maybe 3
  • b one needs to guess - from x 2 to 3.5 your signal jumps from 1000 to 6000 gives 5000 difference - divided by a - a slope of 1/1.5 = 0.66 or greater... lets round to one.

So ultimately using the formula

fitmodel <- nls(y ~a/(1 + exp(-b * (x-c)) ) + d, start=list(a=5000,b=1,c=3, d=1000))

gives a fit (also works without the d). Trying around I found setting algorithm='port' made the command even less sensitive to the start values.

like image 42
bdecaf Avatar answered Dec 07 '22 20:12

bdecaf