TensorBoard has the ability to plot histograms of Tensors at session time. I want a histogram of the gradients during training.
tf.gradients(yvars, xvars)
returns a list of gradients.
However, tf.histogram_summary('name',Tensor)
accepts only Tensors, not lists of Tensors.
For the time being, I made a work-around: I flatten each Tensor into a column vector and concatenate them:
g = tf.reshape(grads[0], [-1, 1])            # start with the first gradient
for l in xrange(1, len(grads)):
    col_vec = tf.reshape(grads[l], [-1, 1])  # flatten to a column vector
    g = tf.concat(0, [g, col_vec])           # old-style tf.concat(axis, values)
grad_hist = tf.histogram_summary("name", g)
What would be a better way to plot the histogram for the gradient?
It seems like a common thing to do, so I would hope TensorFlow has a dedicated function for it.
Another solution (based on this Quora answer) is to access the gradients directly from the optimizer you are already using.
optimizer = tf.train.AdamOptimizer(..)
# compute_gradients returns a list of (gradient, variable) pairs
grads = optimizer.compute_gradients(loss)
# one histogram summary per gradient, named after its variable
grad_summ_op = tf.summary.merge([tf.summary.histogram("%s-grad" % g[1].name, g[0]) for g in grads])
grad_vals = sess.run(fetches=grad_summ_op, feed_dict=feed_dict)
writer['train'].add_summary(grad_vals)
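For context, here is a minimal end-to-end sketch of that approach on a throwaway linear-regression model (TF 1.x graph mode); the toy model, the /tmp/grad_demo log directory, and the random feed data are only for illustration, not part of the original answer. Gradients that come back as None (variables the loss does not touch) are filtered out before building the summaries.
import numpy as np
import tensorflow as tf

# Toy model, purely for illustration.
x = tf.placeholder(tf.float32, [None, 3])
y = tf.placeholder(tf.float32, [None, 1])
w = tf.get_variable("w", [3, 1])
b = tf.get_variable("b", [1])
loss = tf.reduce_mean(tf.square(tf.matmul(x, w) + b - y))

optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)
grads_and_vars = optimizer.compute_gradients(loss)  # list of (gradient, variable) pairs

# One histogram per gradient; skip variables with no gradient. Note the ':' in
# variable names is not a legal summary-name character, so TensorFlow may
# rewrite it and log a warning.
grad_summ_op = tf.summary.merge(
    [tf.summary.histogram("%s-grad" % v.name, g)
     for g, v in grads_and_vars if g is not None])

train_op = optimizer.apply_gradients(grads_and_vars)
writer = tf.summary.FileWriter("/tmp/grad_demo", tf.get_default_graph())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(100):
        feed = {x: np.random.randn(32, 3).astype(np.float32),
                y: np.random.randn(32, 1).astype(np.float32)}
        _, summ = sess.run([train_op, grad_summ_op], feed_dict=feed)
        writer.add_summary(summ, step)
writer.flush()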
Following the suggestion from @user728291, I was able to view the gradients in TensorBoard by using the optimize_loss function as follows.
The call signature of optimize_loss is
optimize_loss(
    loss,
    global_step,
    learning_rate,
    optimizer,
    gradient_noise_scale=None,
    gradient_multipliers=None,
    clip_gradients=None,
    learning_rate_decay_fn=None,
    update_ops=None,
    variables=None,
    name=None,
    summaries=None,
    colocate_gradients_with_ops=False,
    increment_global_step=True
)
The function requires global_step
and depends on a few additional imports, as shown next.
from tensorflow.python.ops import variable_scope
from tensorflow.python.framework import dtypes
from tensorflow.python.ops import init_ops
# global_step needs to be defined for tf.contrib.layers.optimize_loss()
global_step = variable_scope.get_variable(
    "global_step", [],
    trainable=False,
    dtype=dtypes.int64,
    initializer=init_ops.constant_initializer(0, dtype=dtypes.int64))
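As an aside (an assumption on my part, not from the original answer): in later TF 1.x releases you can avoid reaching into the tensorflow.python internals and create the same variable with the public helper tf.train.get_or_create_global_step(); the exact location of that helper has moved between versions, so check your release.
# Alternative sketch, assuming a TF 1.x release where this public helper exists.
global_step = tf.train.get_or_create_global_step()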
Then replace your typical training operation
training_operation = optimizer.minimize(loss_operation)
with
training_operation = tf.contrib.layers.optimize_loss(
    loss_operation, global_step, learning_rate=rate, optimizer='Adam',
    summaries=["gradients"])
Then have a merge statement for your summaries
summary = tf.summary.merge_all()
Then, in your TensorFlow session, at the end of each run/epoch:
summary_writer = tf.summary.FileWriter(logdir_run_x, sess.graph)  # create this once per run
summary_str = sess.run(summary, feed_dict=feed_dict)
summary_writer.add_summary(summary_str, i)  # i = epoch/step index
summary_writer.flush()  # evidently this is needed sometimes or scalars will not show up on TensorBoard
Here logdir_run_x
is a different directory for each run, so that TensorBoard lets you look at each run separately. The gradients will be under the Histograms tab and will have the label OptimizeLoss
. You will see all the weights, all the biases, and the beta
parameter as histograms.
UPDATE: Using TF-Slim, there is another way that also works and is perhaps cleaner.
optimizer = tf.train.AdamOptimizer(learning_rate=rate)
training_operation = slim.learning.create_train_op(
    loss_operation, optimizer, summarize_gradients=True)
By setting summarize_gradients=True
, which is not the default, you will get gradient summaries for all weights. These are viewable in TensorBoard under summarize_grads.
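To round this out, here is a sketch of how that train op could be wired into the same merged-summary/FileWriter loop as above. It assumes import tensorflow.contrib.slim as slim; the names loss_operation, rate, logdir_run_x, feed_dict and the step count are placeholders from your own code.
import tensorflow as tf
import tensorflow.contrib.slim as slim

# As above: ask create_train_op to add gradient histograms.
optimizer = tf.train.AdamOptimizer(learning_rate=rate)
training_operation = slim.learning.create_train_op(
    loss_operation, optimizer, summarize_gradients=True)

# The gradient histograms land in the default summaries collection,
# so merge_all() should pick them up alongside your other summaries.
merged = tf.summary.merge_all()
writer = tf.summary.FileWriter(logdir_run_x, tf.get_default_graph())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):  # placeholder step count
        _, summ = sess.run([training_operation, merged], feed_dict=feed_dict)
        writer.add_summary(summ, step)
writer.flush()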