Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to run gradient descent algorithm when parameter space is constrained?

I would like to maximize a function with one parameter.

So I run gradient descent (or, ascent actually): I start with an initial parameter and keep adding the gradient (times some learning rate factor that gets smaller and smaller), re-evaluate the gradient given the new parameter, and so on until convergence.

But there is one problem: My parameter must stay positive, so it is not supposed to become <= 0 because my function will be undefined. My gradient search will sometimes go into such regions though (when it was positive, the gradient told it to go a bit lower, and it overshoots).

And to make things worse, the gradient at such a point might be negative, driving the search toward even more negative parameter values. (The reason is that the objective function contains logs, but the gradient doesn't.)

What are some good (simple) algorithms that deal with this constrained optimization problem? I'm hoping for just a simple fix to my algorithm. Or maybe ignore the gradient and do some kind of line search for the optimal parameter?

like image 282
Erin Avatar asked Jun 29 '10 02:06

Erin


2 Answers

  1. Each time you update your parameter, check to see if it's negative, and if it is, clamp it to zero.
  2. If clamping to zero is not acceptable, try adding a "log-barrier" (Google it). Basically, it adds a smooth "soft" wall to your objective function (and modifying your gradient) to keep it away from regions you don't want it to go to. You then repeatedly run the optimization by hardening up the wall to make it more infinitely vertical, but starting with the previously found solution. In the limit (in practice only a few iterations are needed), the problem you are solving is identical to the original problem with a hard constraint.
like image 164
Victor Liu Avatar answered Oct 15 '22 17:10

Victor Liu


Without knowing more about your problem, it's hard to give specific advice. Your gradient ascent algorithm may not be particularly suitable for your function space. However, given that's what you've got, here's one tweak that would help.

You're following what you believe is an ascending gradient. But when you move forwards in the direction of the gradient, you discover you have fallen into a pit of negative value. This implies that there was a nearby local maximum, but also a very sharp negative gradient cliff. The obvious fix is to backtrack to your previous position, and take a smaller step (e.g half the size). If you still fall in, repeat with a still smaller step. This will iterate until you find the local maximum at the edge of the cliff.

The problem is, there is no guarantee that your local maximum is actually global (unless you know more about your function than you are sharing). This is the main limitation of naive gradient ascent - it always fixes on the first local maximum and converges to it. If you don't want to switch to a more robust algorithm, one simple approach that could help is to run n iterations of your code, starting each time from random positions in the function space, and keeping the best maximum you find overall. This Monte Carlo approach increases the odds that you will stumble on the global maximum, at the cost of increasing your run time by a factor n. How effective this is will depend on the 'bumpiness' of your objective function.

like image 25
ire_and_curses Avatar answered Oct 15 '22 16:10

ire_and_curses