 

How to use TensorFlow gradient descent optimizer to solve optimization problems

I'm trying to use TensorFlow's Gradient Descent Optimizer to minimize the two-dimensional Rosenbrock function, but when I run the program, the optimizer sometimes diverges towards infinity. At other times, without changing anything, it finds the right neighborhood but never pinpoints the optimal solution.

My code is as follows:

import tensorflow as tf

# Initialize x1 and x2 randomly in [-10, 10]
x1_data = tf.Variable(initial_value=tf.random_uniform([1], -10, 10), name='x1')
x2_data = tf.Variable(initial_value=tf.random_uniform([1], -10, 10), name='x2')

# Loss function: the Rosenbrock function (1 - x1)^2 + 100 * (x2 - x1^2)^2
y = tf.add(tf.pow(tf.sub(1.0, x1_data), 2.0),
           tf.mul(100.0, tf.pow(tf.sub(x2_data, tf.pow(x1_data, 2.0)), 2.0)), 'y')

opt = tf.train.GradientDescentOptimizer(0.0035)
train = opt.minimize(y)

sess = tf.Session()

init = tf.initialize_all_variables()
sess.run(init)

# Run 200 gradient descent steps, logging every 10th
for step in xrange(200):
    sess.run(train)
    if step % 10 == 0:
        print(step, sess.run(x1_data), sess.run(x2_data), sess.run(y))

The Rosenbrock function is defined as y = (1 - x1)^2 + 100 * (x2 - x1^2)^2, with the global minimum at x1 = x2 = 1.

What am I doing wrong here? Or have I completely misunderstood how to use TensorFlow?

asked Jun 28 '16 by K. Lindholm


People also ask

What is the purpose of the optimizer in TensorFlow?

An optimizer is an algorithm used to minimize a loss function with respect to a model's trainable parameters. The most straightforward optimization technique is gradient descent, which iteratively updates a model's parameters by taking a step in the direction of its loss function's steepest descent.

Which Optimizer is best for regression?

Gradient descent is the most basic but most widely used optimization algorithm. It is used heavily in linear regression and classification algorithms. Backpropagation in neural networks also uses a gradient descent algorithm.

What does optimizer usually do in gradient descent in linear regression?

Gradient descent is an optimization algorithm that minimizes a function by repeatedly taking steps in the direction of steepest descent, which is given by the negative of the gradient.
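
As a minimal illustration (a plain-Python sketch, not taken from any of the answers below), here is that update rule applied to the one-dimensional function f(x) = (1 - x)^2, whose minimum is at x = 1:

def f_grad(x):
    return -2.0 * (1.0 - x)  # derivative of (1 - x)^2

x = 5.0             # arbitrary starting point
learning_rate = 0.1
for step in range(50):
    x -= learning_rate * f_grad(x)  # step against the gradient

print(x)  # converges towards 1.0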


2 Answers

If you decrease the range of the initial x1/x2 values (e.g. use [-3, 3] instead of [-10, 10]) and decrease the learning rate by a factor of 10, it shouldn't blow up as often. Decreasing the learning rate when you see things diverging is often a good thing to try.

Also, the function you're optimizing is deliberately constructed so that the global minimum is hard to find, so it's no surprise that it reaches the valley but not the global optimum ;)
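
A minimal sketch of those two changes applied to the question's code (the exact initial range and learning rate here are illustrative):

import tensorflow as tf

# Narrower initial range: [-3, 3] instead of [-10, 10]
x1_data = tf.Variable(initial_value=tf.random_uniform([1], -3, 3), name='x1')
x2_data = tf.Variable(initial_value=tf.random_uniform([1], -3, 3), name='x2')

# Same Rosenbrock loss as in the question
y = tf.add(tf.pow(tf.sub(1.0, x1_data), 2.0),
           tf.mul(100.0, tf.pow(tf.sub(x2_data, tf.pow(x1_data, 2.0)), 2.0)), 'y')

# Learning rate reduced by a factor of 10
opt = tf.train.GradientDescentOptimizer(0.00035)
train = opt.minimize(y)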

answered Nov 14 '22 by etarion


Yes, as @etarion says, this is an optimization issue; your TensorFlow code is fine.

One way to make sure the gradients never explode is to clip them to the range [-10., 10.], for instance:

# Use a smaller learning rate and clip each gradient to [-10, 10]
opt = tf.train.GradientDescentOptimizer(0.0001)
grads_and_vars = opt.compute_gradients(y, [x1_data, x2_data])
clipped_grads_and_vars = [(tf.clip_by_value(g, -10., 10.), v) for g, v in grads_and_vars]

train = opt.apply_gradients(clipped_grads_and_vars)
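
For completeness, a sketch of how this fits into the question's training loop (assuming the same x1_data, x2_data and y as defined in the question):

sess = tf.Session()
sess.run(tf.initialize_all_variables())

for step in xrange(200):
    sess.run(train)  # applies the clipped gradients
    if step % 10 == 0:
        print(step, sess.run(x1_data), sess.run(x2_data), sess.run(y))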
answered Nov 14 '22 by Olivier Moindrot