
How to handle boundary constraints when using `nls.lm` in R

I asked this question a while ago. I am not sure whether I should post this as an answer or as a new question. I do not have an answer, but I "solved" the problem by applying the Levenberg-Marquardt algorithm via nls.lm in R and, whenever the solution lands on the boundary, running the trust-region-reflective algorithm (TRR, implemented in R) to step away from it. Now I have new questions.

In my experience, this approach reaches the optimum and is not so sensitive to the starting values. But it is only a practical workaround for the issues I encountered with nls.lm and other optimization functions in R. I would like to know why nls.lm behaves this way for optimization problems with boundary constraints, and how to handle boundary constraints when using nls.lm in practice.
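One practical workaround (not from the original post, and sketched here with Python's scipy.optimize.least_squares rather than nls.lm, since the idea is algorithm-agnostic) is to reparameterise a bounded parameter so the optimizer works on an unbounded scale. For a fraction such as f_parent_to_m1, a logit transform keeps LM inside (0, 1) by construction; the data and parameter values below are made up for illustration:

```python
import numpy as np
from scipy.optimize import least_squares

# logit reparameterisation: optimise theta on (-inf, inf),
# map to f = 1/(1 + exp(-theta)) in (0, 1), so plain LM can never
# step outside the box even though it knows nothing about bounds
def to_frac(theta):
    return 1.0 / (1.0 + np.exp(-theta))

x = np.linspace(0, 5, 40)
y = 0.7 * np.exp(-x)          # synthetic data with true fraction f = 0.7

def resid(p):
    f = to_frac(p[0])          # transform inside the residual function
    return f * np.exp(-x) - y

fit = least_squares(resid, [0.0], method="lm")   # unconstrained LM
f_hat = to_frac(fit.x[0])                        # back-transform: ~0.7
```

The same trick works in R by wrapping the transform inside the residual function passed to nls.lm; a log transform plays the analogous role for parameters bounded below by 0.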

Below I give an example illustrating the two issues with nls.lm.

  1. It is sensitive to starting values.
  2. It stops when some parameter reaches the boundary.

A Reproducible Example: FOCUS Dataset D

```r
library(devtools)
install_github("KineticEval", "zhenglei-gao")
library(KineticEval)
data(FOCUS2006D)
km <- mkinmod.full(
  parent = list(type = "SFO",
                M0 = list(ini = 0.1, fixed = 0, lower = 0.0, upper = Inf),
                to = "m1"),
  m1 = list(type = "SFO"),
  data = FOCUS2006D)
system.time(Fit.TRR <- KinEval(km, evalMethod = 'NLLS', optimMethod = 'TRR'))
system.time(Fit.LM  <- KinEval(km, evalMethod = 'NLLS', optimMethod = 'LM',
                               ctr = kingui.control(runTRR = FALSE)))
compare_multi_kinmod(km, rbind(Fit.TRR$par, Fit.LM$par))
dev.print(jpeg, "LMvsTRR.jpeg", width = 480)
```

LM fit vs TRR fit

The differential equations that describe the model/system are:

```
d_parent = - k_parent * parent
d_m1     = - k_m1 * m1 + k_parent * f_parent_to_m1 * parent
```

In the graph, the left panel shows the model with the initial values, the middle panel the fit with "TRR" (similar to the algorithm in Matlab's lsqnonlin function), and the right panel the fit with "LM" via nls.lm. Looking at the fitted parameters (Fit.LM$par), you will find that one fitted parameter (f_parent_to_m1) is at the boundary 1. If I change the starting value of M0_parent from 0.1 to 100, I get the same results from nls.lm and lsqnonlin. I have many cases like this one.
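For reference, this SFO-SFO system has a closed-form solution, so the model curves can be computed without an ODE solver. Below is a minimal sketch in Python (not the KineticEval internals; the helper name is mine, and it assumes k_m1 != k_parent):

```python
import numpy as np

def sfo_sfo(t, M0, k_parent, k_m1, f):
    """Closed-form solution of
        d_parent = -k_parent * parent
        d_m1     = -k_m1 * m1 + f * k_parent * parent
    with parent(0) = M0 and m1(0) = 0.  Assumes k_m1 != k_parent."""
    parent = M0 * np.exp(-k_parent * t)
    m1 = (M0 * f * k_parent / (k_m1 - k_parent)
          * (np.exp(-k_parent * t) - np.exp(-k_m1 * t)))
    return parent, m1
```

A residual function for least squares can call this directly instead of integrating the ODEs numerically at every iteration.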

```r
newpars <- rbind(Fit.TRR$par, Fit.LM$par)
rownames(newpars) <- c("TRR(lsqnonlin)", "LM(nls.lm)")
newpars
##                M0_parent   k_parent        k_m1 f_parent_to_m1
## TRR(lsqnonlin)  99.59848 0.09869773 0.005260654       0.514476
## LM(nls.lm)      84.79150 0.06352110 0.014783294       1.000000
```

Besides the above problems, it often happens that the Hessian returned by nls.lm is not invertible (especially when some parameters are on the boundary), so I cannot get an estimate of the covariance matrix. On the other hand, the "TRR" algorithm (in Matlab) almost always gives an estimate by calculating the Jacobian at the solution point. I find this useful, but I also suspect that the R optimization algorithms (the ones I have tried) do not do this for a reason. I would like to know whether it is wrong to use the Matlab way of calculating the covariance matrix to get standard errors for the parameter estimates.
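For what it's worth, the "Matlab way" is the standard Gauss-Newton approximation: with residual variance s^2 = RSS / (n - p), the covariance is approximated by s^2 * (J'J)^-1, with J the Jacobian at the solution. A minimal sketch in Python (hypothetical straight-line data, not the kinetic model) shows the computation; the usual caveat is that this approximation is unreliable precisely when a parameter sits on a boundary, because the interior-optimum assumption behind it no longer holds:

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 100)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=x.size)  # made-up data

def resid(p):
    return p[0] + p[1] * x - y

fit = least_squares(resid, [0.0, 0.0], method="lm")
J = fit.jac                    # Jacobian at the solution
rss = 2.0 * fit.cost           # cost is 0.5 * sum(residuals^2)
sigma2 = rss / (x.size - 2)    # residual variance, n - p degrees of freedom
cov = sigma2 * np.linalg.inv(J.T @ J)
se = np.sqrt(np.diag(cov))     # standard errors of the estimates
```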

One last note: I claimed in my previous post that Matlab's lsqnonlin outperforms R's optimization functions in almost all cases. I was wrong. The "Trust-Region-Reflective" algorithm used in Matlab is in fact slower (sometimes much slower) when also implemented in R, as you can see from the above example. However, it is still more stable and reaches a better solution than R's basic optimization algorithms.

asked Jun 11 '13 by Zhenglei


1 Answer

First off, I am not an expert on Matlab or optimisation, and I have never used R.

I am not sure I see what your actual question is, but maybe I can shed some light on your puzzlement:

LM is a slightly enhanced Gauß-Newton approach; for problems with several local minima it is very sensitive to the initial values. Including boundaries typically generates more of those minima.

TRR is akin to LM, but more robust: it is better at "jumping out of" bad local minima. It is quite plausible that it behaves better but performs worse than LM. Actually explaining why is very hard; you would need to study the algorithms in detail and look at how they behave in this situation.

I cannot explain the difference between Matlab's and R's implementations, but there are several extensions to TRR that Matlab may use and R does not. Does your approach of using LM and TRR alternately converge better than TRR alone?
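To make the bounds-handling contrast concrete, here is an illustration using Python's scipy.optimize.least_squares, which implements both algorithms (this is not the R code from the question, and the data are made up): method="lm" is MINPACK Levenberg-Marquardt and accepts no bounds at all, while method="trf" is a trust-region-reflective variant that does. With data whose unconstrained optimum lies outside the box, LM simply walks out of it, while TRF returns a solution pinned at the bound; a solution legitimately ending on a bound looks exactly like the f_parent_to_m1 = 1 case above:

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
x = np.linspace(0, 5, 50)
# true amplitude A = 1.2 lies OUTSIDE the box [0, 1] we will impose
y = 1.2 * np.exp(-0.8 * x) + rng.normal(scale=0.01, size=x.size)

def resid(p):
    A, k = p
    return A * np.exp(-k * x) - y

p0 = [0.5, 0.5]
fit_lm  = least_squares(resid, p0, method="lm")            # no bounds possible
fit_trf = least_squares(resid, p0, method="trf",           # box-constrained TRR
                        bounds=([0, 0], [1, np.inf]))
# fit_lm.x[0]  ends up near 1.2 (outside the box)
# fit_trf.x[0] ends up at the upper bound 1
```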

answered Oct 25 '22 by Jörg Neulist