
Keras ML library: how to do weight clipping after gradient updates? TensorFlow backend

I'm trying to use Keras to implement part of an algorithm that requires weight clipping, i.e. limiting the weight values after a gradient update. I haven't found any solutions through web searches so far.

For background, this has to do with the WGAN algorithm:

https://arxiv.org/pdf/1701.07875.pdf

If you look at algorithm 1 on page 8, you'll see the following:

[Image: Algorithm 1 (WGAN) from the paper, with the weight-clipping step w ← clip(w, −c, c) highlighted]

I've highlighted the lines that I'm trying to implement in Keras: after computing a gradient and using it to update the weights in the network, I want to make sure that all the weights are clipped to a range [-c, c] that I can set.

How could I go about doing this in Keras?

For reference I am using the TensorFlow backend. I don't mind digging into things and adding messy quick-fixes for now.

asked Feb 16 '17 by JDS

People also ask

How do you apply gradient clipping in TensorFlow?

Applying gradient clipping in TensorFlow models is quite straightforward. The only thing you need to do is pass the appropriate parameter to the optimizer. All optimizers have `clipnorm` and `clipvalue` parameters that can be used to clip the gradients.
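For instance, a minimal sketch using the standalone Keras optimizers API (the clipping thresholds here are arbitrary, illustrative values):

from keras.optimizers import SGD

# clip each individual gradient element to the range [-0.5, 0.5]
opt_by_value = SGD(lr=0.01, clipvalue=0.5)

# rescale the whole gradient so its L2 norm does not exceed 1.0
opt_by_norm = SGD(lr=0.01, clipnorm=1.0)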

When to clip gradients?

If a gradient exceeds some threshold value, we clip it to that threshold. If a gradient falls below the negative of that threshold, we clip it to the lower limit as well.
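As a tiny NumPy illustration of that element-wise rule (the threshold value is arbitrary):

import numpy as np

threshold = 0.5
g = np.array([-1.2, 0.3, 0.9])
g_clipped = np.clip(g, -threshold, threshold)  # -> [-0.5, 0.3, 0.5]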

Is gradient clipping good?

Although clipping introduces a bias into the resulting gradient values, it can keep training stable. Recurrent neural networks are notoriously difficult to train, and vanishing and exploding gradients are two common problems when training them.

Why use gradient clipping?

Vanishing gradients can stall optimization because the gradient is too small to make progress, while exploding gradients produce updates so large that they corrupt the parameters. Gradient clipping guards against the oversized gradients that mess up the parameters during training.


1 Answer

While creating the optimizer object, set the clipvalue parameter. It will do precisely what you want.

from keras.optimizers import RMSprop

# all parameter gradients will be clipped to
# a maximum value of 0.5 and
# a minimum value of -0.5
rmsprop = RMSprop(clipvalue=0.5)

and then use this object when compiling the model:

model.compile(loss='mse', optimizer=rmsprop)

For more reference, check the Keras documentation on optimizers.

Also, I prefer clipnorm over clipvalue because with clipnorm the optimization remains more stable. For example, say you have 2 parameters and the gradients come out to be [0.1, 3]. With clipvalue (set to 0.5) the gradients become [0.1, 0.5], i.e. the direction of steepest descent can change drastically. clipnorm doesn't have this problem: all the gradient components are scaled by the same factor, so the direction is preserved while the constraint on the gradient's magnitude is still enforced.
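Here is a small NumPy sketch (not Keras internals) of that difference, using the gradients and the 0.5 threshold from the example above:

import numpy as np

g = np.array([0.1, 3.0])

# clipvalue: clip each element independently; the direction can change a lot
g_value = np.clip(g, -0.5, 0.5)            # -> [0.1, 0.5]

# clipnorm: rescale the whole vector so its L2 norm is at most 0.5;
# the direction is preserved
norm = np.linalg.norm(g)
g_norm = g * (0.5 / norm) if norm > 0.5 else g   # -> roughly [0.017, 0.5]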

Edit: The question asks about weight clipping, not gradient clipping:

Clipping of weights is not part of the Keras code, but a maxnorm constraint on weights is; see the keras.constraints module.
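For example (assuming the Keras 1-style API used in the rest of this answer), the built-in maxnorm constraint bounds the norm of each unit's incoming weight vector rather than clipping individual values:

from keras.constraints import maxnorm
from keras.layers import Dense

# each hidden unit's incoming weight vector is rescaled so that
# its L2 norm does not exceed 2 (a norm bound, not value clipping)
layer = Dense(30, input_dim=100, W_constraint=maxnorm(2))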

Having said that, it can be easily implemented. Here is a very small example:

from keras.constraints import Constraint
from keras import backend as K

class WeightClip(Constraint):
    '''Clips the weights incident to each hidden unit to be inside a range
    '''
    def __init__(self, c=2):
        self.c = c

    def __call__(self, p):
        return K.clip(p, -self.c, self.c)

    def get_config(self):
        return {'name': self.__class__.__name__,
                'c': self.c}

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
# the constraint is applied only to this layer's W (kernel) weights
model.add(Dense(30, input_dim=100, W_constraint=WeightClip(2)))
model.add(Dense(1))

model.compile(loss='mse', optimizer='rmsprop')

X = np.random.random((1000, 100))
Y = np.random.random((1000, 1))

model.fit(X, Y)

I have tested that the above code runs, but not the validity of the constraints. You can do so by getting the model weights after training using model.get_weights() or model.layers[idx].get_weights() and checking whether they abide by the constraints.
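For instance, a quick check along those lines (a sketch, assuming the model above):

import numpy as np

# with c=2, every entry of the constrained layer's kernel should lie in [-2, 2]
W = model.layers[0].get_weights()[0]   # kernel of the first (constrained) Dense layer
print(np.abs(W).max())                 # expected to be <= 2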

Note: The constraint is not added to all the model weights, but only to the weights of the specific layer it is used on. Also, W_constraint adds the constraint to the W (kernel) parameter and b_constraint to the b (bias) parameter.
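For example, a hypothetical variant of the model above that clips both the kernel and the bias of the first layer:

from keras.models import Sequential
from keras.layers import Dense

constrained = Sequential()
constrained.add(Dense(30, input_dim=100,
                      W_constraint=WeightClip(2),   # clip kernel weights to [-2, 2]
                      b_constraint=WeightClip(2)))  # clip bias values to [-2, 2]
constrained.add(Dense(1))  # this layer's weights remain unconstrained
constrained.compile(loss='mse', optimizer='rmsprop')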

answered Oct 17 '22 by indraforyou