What is the best way to implement weight noise in Tensorflow? Should I extract all the weights and apply noise? Or should I apply noise to the gradients?
Weight noise and gradient noise are not the same.
Weight noise perturbs the parameters directly at each update step:

w_{t+1} = w_t - lr * g_t + eps_t,   eps_t ~ N(0, sigma^2)
Gradient noise (sometimes called Langevin noise) instead adds noise whose scale depends on the step size and a temperature T:

w_{t+1} = w_t - lr * g_t - sqrt(lr * T) * eps_t,   eps_t ~ N(0, 1)

The latter is the update rule of the Stochastic Gradient Langevin Dynamics (SGLD) optimizer.
In any case, it is pretty straightforward to implement both in TensorFlow.
import tensorflow as tf

# Assuming you have already defined a graph, a scalar loss `loss`,
# and a learning rate `lr`; noise is drawn from a normal distribution.

# Weight noise:
optimizer = tf.train.GradientDescentOptimizer(lr)
grads_and_vars = optimizer.compute_gradients(loss, tf.trainable_variables())
train_ops = [tf.assign(v, v - lr * g + tf.random_normal(v.shape, stddev=0.1))
             for g, v in grads_and_vars]
train_op = tf.group(*train_ops)
# Langevin noise (T is the temperature hyperparameter, e.g. a float constant):
optimizer = tf.train.GradientDescentOptimizer(lr)
grads_and_vars = optimizer.compute_gradients(loss, tf.trainable_variables())
train_ops = [tf.assign(v, v - lr * g - tf.sqrt(lr * T) * tf.random_normal(v.shape, stddev=1.0))
             for g, v in grads_and_vars]
train_op = tf.group(*train_ops)
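For reference, the same two update rules can be sketched framework-free in NumPy, which makes the difference between them easy to see. This is a minimal illustration, not part of the TensorFlow code above; the toy quadratic loss and the choice of `T` are assumptions for the demo:

```python
import numpy as np

rng = np.random.default_rng(0)

def weight_noise_step(w, grad, lr, stddev=0.1):
    # Weight noise: w_{t+1} = w_t - lr * g + eps, eps ~ N(0, stddev^2).
    # The noise scale is fixed, independent of the learning rate.
    return w - lr * grad + rng.normal(0.0, stddev, size=w.shape)

def langevin_step(w, grad, lr, T):
    # Langevin (SGLD) noise: w_{t+1} = w_t - lr * g - sqrt(lr * T) * eps,
    # eps ~ N(0, 1). The noise scale shrinks with the step size lr.
    return w - lr * grad - np.sqrt(lr * T) * rng.normal(0.0, 1.0, size=w.shape)

# Toy quadratic loss L(w) = 0.5 * ||w||^2, so grad L(w) = w.
w = np.ones(3)
for _ in range(100):
    w = langevin_step(w, w, lr=0.01, T=1e-4)
```

With a small temperature the Langevin iterates behave like plain gradient descent plus a vanishing perturbation, so `w` drifts toward the minimum at the origin.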