
What is the best way to implement weight constraints in TensorFlow?

Tags: tensorflow

Suppose we have weights

x = tf.Variable(np.random.random((5, 10)))
cost = ...

And we use the GD optimizer:

upds = tf.train.GradientDescentOptimizer(lr).minimize(cost)
session.run(upds)

How can we implement, for example, a non-negativity constraint on the weights?

I tried clipping them:

upds = tf.train.GradientDescentOptimizer(lr).minimize(cost)
session.run(upds)
session.run(tf.assign(x, tf.clip_by_value(x, 0, np.infty)))

But this slows down my training by a factor of 50.

Does anybody know a good way to implement such constraints on the weights in TensorFlow?

P.S.: in the equivalent Theano algorithm, I had

T.clip(x, 0, np.infty) 

and it ran smoothly.

asked Nov 13 '15 by Denis L

People also ask

What is a weight constraint?

A weight constraint is an update applied to the network that checks the size of the weights; if the size exceeds a predefined limit, the weights are rescaled so that they fall below the limit or within a given range.
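For instance, a max-norm rescaling step could look like the following minimal sketch (the limit max_norm and the variable w are illustrative, not taken from the question):

import numpy as np
import tensorflow as tf

max_norm = 3.0  # illustrative limit on the L2 norm of the weights
w = tf.Variable(np.random.random((5, 10)))

# Rescale w so its L2 norm does not exceed max_norm; a no-op when it is already within the limit.
apply_constraint = tf.assign(w, tf.clip_by_norm(w, max_norm))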

What are constraints in keras?

Classes from the keras.constraints module allow setting constraints (e.g. non-negativity) on model parameters during training. They are per-variable projection functions applied to the target variable after each gradient update (when using fit()).
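For example, a non-negativity constraint could be attached to a layer roughly as in this sketch using the tf.keras API (the layer sizes are arbitrary):

from tensorflow import keras

# kernel_constraint projects the layer's kernel after every gradient update during fit().
model = keras.Sequential([
    keras.layers.Dense(10, activation="relu", input_shape=(5,),
                       kernel_constraint=keras.constraints.NonNeg()),
    keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")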

How do you set weights in keras?

Use the get_weights() function to get the weights and biases of the layers before training the model. These are the weights and biases with which the layers will be initialized.
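A small sketch of reading and writing layer parameters, assuming a Keras model like the one sketched above (set_weights() is the counterpart for assigning values):

# Read the current kernel and bias of the first layer, modify them, and write them back.
kernel, bias = model.layers[0].get_weights()
kernel = abs(kernel)  # e.g. force the kernel to be non-negative
model.layers[0].set_weights([kernel, bias])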


2 Answers

You can take the Lagrangian approach and simply add a penalty to the objective for values of the variable you don't want.

e.g. To encourage theta to be non-negative, you could add the following to the optimizer's objective function.

    added_loss = -tf.minimum(tf.reduce_min(theta), 0)

If any entries of theta are negative, then added_loss will be positive; otherwise it is zero. Scaling it to a meaningful value is left as an exercise to the reader: too little will not exert enough pressure, while too much may make things unstable.
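A minimal sketch of this penalty approach, with an arbitrary stand-in objective and penalty scale (neither is from the answer):

import numpy as np
import tensorflow as tf

lr = 0.1
theta = tf.Variable(np.random.random((5, 10)))

# Stand-in objective: pulls theta toward -0.5 so the constraint actually matters.
base_cost = tf.reduce_sum(tf.square(theta + 0.5))

# Penalty term from the answer: positive when any entry of theta is negative, zero otherwise.
added_loss = -tf.minimum(tf.reduce_min(theta), 0)

penalty_coef = 10.0  # arbitrary scale; too small exerts no pressure, too large can be unstable
upds = tf.train.GradientDescentOptimizer(lr).minimize(base_cost + penalty_coef * added_loss)

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    for _ in range(100):
        session.run(upds)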

answered Sep 29 '22 by Mark Borgerding


As of TensorFlow 1.4, there is a new argument to tf.get_variable that lets you pass a constraint function, which is applied after the optimizer's update. Here is an example that enforces a non-negativity constraint:

with tf.variable_scope("MyScope"):
    v1 = tf.get_variable("v1", …, constraint=lambda x: tf.clip_by_value(x, 0, np.infty))

constraint: An optional projection function to be applied to the variable after being updated by an Optimizer (e.g. used to implement norm constraints or value constraints for layer weights). The function must take as input the unprojected Tensor representing the value of the variable and return the Tensor for the projected value (which must have the same shape). Constraints are not safe to use when doing asynchronous distributed training.

answered Sep 29 '22 by Robin Dinse