Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Limit/minimize step size in scipy optimization?

I'm using the following command (with scipy, inside python):

minimize(func, 0.35, method='L-BFGS-B, bounds=np.array([0.075, None]), options={'eps':0.01}) 

This does the following: Minimizes my function (func) by varying its one input parameter (the parameter is temperature, this is a chemistry simulation), with the initial guess 0.35, keeping temperature in the range [0.075, inf), taking the initial step size of 0.01 (in other words, the second point it tests is 0.36, after the initial 0.35).

This is all fine. The problem is that after a while, the step sizes get very tiny. The bfgs optimizer starts by taking step sizes of 0.01 but this quickly is tightened down to very small step sizes. At the end, sometimes it only changes temperature out to about the 8th or 9th decimal place. This is a problem because the function I am minimizing is not that sensitive. Basically, the temperature parameter is being passed to a computational chemistry simulation package. It uses a bit of random number seeding, and inside of each interation of bfgs are probably quadrillions of FLOPs inside the chemistry simulation, which mostly runs things in c++ double precision. So out to 8 or 9 decimal places, there is a lot of noise effecting the energy (energy is the output of the function, which I am trying to minimize, by varying temperature), and the random number seeding effects it to a small amount as well.

So what I want to do is tell the scipy optimizer that it cannot take steps smaller than, for example, 1e-4. But I can't seem to find a way to do this. I want to stick with the L-BFGS-B method, if possible. I have looked through some of the documentation but the only thing I've found so far is how to choose the INITIAL step size with the 'eps' option.

like image 808
iammax Avatar asked Sep 07 '17 17:09

iammax


2 Answers

I am a bit late to the party, but I would just like to share my workaround when I was faced with similar problem. It seems that the initial step of the optimizer is relative to the initial guess of the variable that is being optimized (x0 argument). In my case I needed to optimize an angle. When my initial guess of the angle was close to zero degrees, the algorithm took really small steps (fractions of degrees), which was less then sensitivity of my function. This resulted in the unsuccessful search for the right solution. I was able to fix the problem by adding 360 degrees to the original initial guess of the angle. This forced the minimizing algorithm to take bigger steps at the beginning and converge to the right value.

You could try to do something similar by adding a constant bias to your model before the optimization and subtracting it later. It's not the most elegant solution, but it helped in my case.

like image 84
Matej Jeglič Avatar answered Nov 10 '22 12:11

Matej Jeglič


Your interpretation is wrong: eps is not controlling the step-size and 0.36 is not necessarily the second point visited (when eps=0.01).

eps is solely used for numerical-differentiation by finite-differences when you don't give a gradient!

There is no step-size to tune in L-BFGS-B as it's using line-searches for approximating the optimal step-size (with some safeguards as needed by the underlying theory).

When L-BFGS-B is doing these tiny steps at a later stage, then it has it's reason. I'm pretty sure, the step-size of 1 is checked in every iterations as first-value (as we usually wan't to do big steps).

That being said, it seems your problem is somewhere else, which is hard to guess because we don't have all the details. But reading your tiny explanation about what you are doing, i would be very scared: The combination of L-BFGS-B together with a noisy-function (PRNG) and numerical-differentiation will be pretty unstable. It might be even worse as we are also approximating some inverse-hessian internally. This really sounds like the wrong method!

(I ignored user2357112's comment here as you say it's a real multivariate task in your case. Otherwise yes, use specifically designed methods!)

like image 36
sascha Avatar answered Nov 10 '22 11:11

sascha