Suppose we have weights
x = tf.Variable(np.random.random((5, 10)))
cost = ...
And we use the GD optimizer:
upds = tf.train.GradientDescentOptimizer(lr).minimize(cost)
session.run(upds)
How can we implement, for example, a non-negativity constraint on the weights?
I tried clipping them:
upds = tf.train.GradientDescentOptimizer(lr).minimize(cost)
session.run(upds)
session.run(tf.assign(x, tf.clip_by_value(x, 0, np.infty)))
But this slows down my training by a factor of 50.
Does anybody know a good way to implement such constraints on the weights in TensorFlow?
P.S.: in the equivalent Theano algorithm, I had
T.clip(x, 0, np.infty)
and it ran smoothly.
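A likely cause of the slowdown is that calling tf.clip_by_value and tf.assign inside the training loop adds new ops to the graph on every iteration. A minimal sketch, assuming the x and upds defined above (num_steps is just a placeholder loop bound), that builds the projection op once and reuses it:

clip_x = tf.assign(x, tf.clip_by_value(x, 0, np.infty))  # built once, outside the loop
for step in range(num_steps):
    session.run(upds)    # gradient step
    session.run(clip_x)  # project the weights back to the non-negative orthant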
A weight constraint is an update applied to the network that checks the size of the weights and, if the size exceeds a predefined limit, rescales the weights so that they fall below the limit or within a given range.
The keras.constraints module allows setting constraints (e.g. non-negativity) on model parameters during training. They are per-variable projection functions applied to the target variable after each gradient update (when using fit()).
Use the get_weights() function to get the weights and biases of the layers before training the model. These are the weights and biases with which the layers will be initialized.
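For example, in Keras a non-negativity constraint can be attached to a layer's kernel via kernel_constraint; a minimal sketch (the layer sizes, optimizer, and loss are illustrative placeholders, not from the original question):

from tensorflow import keras
from tensorflow.keras import layers, constraints

model = keras.Sequential([
    layers.Dense(10, activation="relu", input_shape=(5,),
                 kernel_constraint=constraints.NonNeg()),  # kernel projected to >= 0 after each update
    layers.Dense(1),
])
print(model.layers[0].get_weights())  # initial kernel and bias, before training
model.compile(optimizer="sgd", loss="mse")
# model.fit(...) would then apply the constraint after every gradient update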
You can take the Lagrangian approach and simply add a penalty to the objective for values of the variable you don't want.
For example, to encourage theta to be non-negative, you could add the following to the optimizer's objective function:
added_loss = -tf.minimum(tf.reduce_min(theta), 0)
If any theta values are negative, then added_loss will be positive; otherwise it is zero. Scaling it to a meaningful value is left as an exercise for the reader: scaling too little will not exert enough pressure, while scaling too much may make training unstable.
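Putting that together, a minimal sketch (penalty_scale is a hypothetical knob you would tune; it is not part of the original answer):

added_loss = -tf.minimum(tf.reduce_min(theta), 0)   # > 0 only if some theta entries are negative
penalty_scale = 10.0                                 # assumed value; tune for your problem
upds = tf.train.GradientDescentOptimizer(lr).minimize(cost + penalty_scale * added_loss)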
As of TensorFlow 1.4, there is a new constraint argument to tf.get_variable that lets you pass a projection function which is applied after the optimizer's update. Here is an example that enforces a non-negativity constraint:
with tf.variable_scope("MyScope"):
    v1 = tf.get_variable("v1", …, constraint=lambda x: tf.clip_by_value(x, 0, np.infty))
constraint: An optional projection function to be applied to the variable after being updated by an Optimizer (e.g. used to implement norm constraints or value constraints for layer weights). The function must take as input the unprojected Tensor representing the value of the variable and return the Tensor for the projected value (which must have the same shape). Constraints are not safe to use when doing asynchronous distributed training.
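A slightly fuller sketch of the same idea (the shape, toy objective, and learning rate are illustrative assumptions, not from the original answer):

import numpy as np
import tensorflow as tf

with tf.variable_scope("MyScope"):
    v1 = tf.get_variable("v1", shape=(5, 10),
                         constraint=lambda x: tf.clip_by_value(x, 0, np.infty))
cost = tf.reduce_sum(tf.square(v1 - 1.0))  # toy objective, for illustration only
upds = tf.train.GradientDescentOptimizer(0.1).minimize(cost)

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    session.run(upds)  # per the docstring above, v1 is projected back to non-negative values after the update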