Tensorboard - visualize weights of LSTM

I am using several LSTM layers to form a deep recurrent neural network. I would like to monitor the weights of each LSTM layer during training. However, I couldn't find out how to attach summaries of the LSTM layer weights to TensorBoard.

Any suggestions on how this can be done?

The code is as follows:

cells = []

with tf.name_scope("cell_1"):
    cell1 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True, initializer=self.initializer)
    cell1 = tf.contrib.rnn.DropoutWrapper(cell1,
                input_keep_prob=self.input_dropout,
                output_keep_prob=self.output_dropout,
                state_keep_prob=self.recurrent_dropout)
    cells.append(cell1)

with tf.name_scope("cell_2"):
    cell2 = tf.contrib.rnn.LSTMCell(self.n_hidden, state_is_tuple=True, initializer=self.initializer)
    cell2 = tf.contrib.rnn.DropoutWrapper(cell2,
                output_keep_prob=self.output_dropout,
                state_keep_prob=self.recurrent_dropout)
    cells.append(cell2)

with tf.name_scope("cell_3"):
    cell3 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True, initializer=self.initializer)
    # cell has no input dropout since previous cell already has output dropout
    cell3 = tf.contrib.rnn.DropoutWrapper(cell3,
                output_keep_prob=self.output_dropout,
                state_keep_prob=self.recurrent_dropout)
    cells.append(cell3)

cell = tf.contrib.rnn.MultiRNNCell(
    cells, state_is_tuple=True)

output, self.final_state = tf.nn.dynamic_rnn(
    cell,
    inputs=self.inputs,
    initial_state=self.init_state)

1 Answer

tf.contrib.rnn.LSTMCell objects have a property called variables that works for this. There's just one trick: The property returns an empty list until your cell goes through tf.nn.dynamic_rnn. At least this is the case when using a single LSTMCell. I can't speak for MultiRNNCell. So I expect this would work:

output, self.final_state = tf.nn.dynamic_rnn(...)
# Note: iterate the bare LSTMCell objects here; the DropoutWrapper case is covered below.
for one_lstm_cell in cells:
    one_kernel, one_bias = one_lstm_cell.variables
    # I think TensorBoard handles summaries with the same name fine.
    tf.summary.histogram("Kernel", one_kernel)
    tf.summary.histogram("Bias", one_bias)

And then you probably know how to do it from there, but

summary_op = tf.summary.merge_all()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter(
        "my/preferred/logdir/train", graph=tf.get_default_graph())
    for step in range(1, training_steps+1):
        ...
        _, step_summary = sess.run([train_op, summary_op])
        train_writer.add_summary(step_summary, global_step=step)
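
Then point TensorBoard at the log directory, e.g. tensorboard --logdir my/preferred/logdir/train, and the histograms show up under the Histograms and Distributions tabs.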

Looking at the TensorFlow documentation for LSTMCell, there's also a weights property. I don't know the difference, if there is any. Also, the order in which variables returns the kernel and bias isn't documented; I figured it out by printing the resulting list and looking at the variable names.
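
If you want to check the ordering for your own graph, printing the variables is enough. A quick sketch, assuming cells holds the bare LSTMCell objects from the question and tf.nn.dynamic_rnn has already been called:

# Run after tf.nn.dynamic_rnn so the variables actually exist.
for i, one_lstm_cell in enumerate(cells):
    for v in one_lstm_cell.variables:
        # Names like ".../lstm_cell/kernel:0" and ".../lstm_cell/bias:0" reveal the order.
        print("cell %d: %s %s" % (i, v.name, v.get_shape()))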

Now, MultiRNNCell has the same variables property according to its documentation, which says it returns all layer variables. I honestly don't know how MultiRNNCell works, so I cannot tell you whether these are variables belonging exclusively to MultiRNNCell or whether they include the variables from the cells that go into it. Either way, knowing the property exists should be a nice tip! Hope this helps.
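
If you want to try that route, here's a minimal sketch, assuming the MultiRNNCell named cell from the question and that dynamic_rnn has already been run:

# Attach a histogram for whatever MultiRNNCell reports through .variables.
for var in cell.variables:
    # Strip the ":0" suffix so the name is a valid summary name.
    tf.summary.histogram(var.name.split(":")[0], var)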


Although variables is documented for most (all?) RNN classes, it does break for DropoutWrapper. The property has been documented since r1.2, but accessing it raises an exception in 1.2 and 1.4 (and presumably 1.3, but untested). Specifically,

import tensorflow as tf
from tensorflow.contrib import rnn
...  # num_hidden and x (a list of per-time-step input tensors) defined elsewhere
lstm_cell = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
wrapped_cell = rnn.DropoutWrapper(lstm_cell)
outputs, states = rnn.static_rnn(wrapped_cell, x, dtype=tf.float32)
print("LSTM vars!", lstm_cell.variables)
print("Wrapped vars!", wrapped_cell.variables)

will throw AttributeError: 'DropoutWrapper' object has no attribute 'trainable'. From the traceback (or a long stare at the DropoutWrapper source), I noticed that variables is implemented in Layer, the parent class of DropoutWrapper's parent RNNCell. Dizzy yet? Indeed, we find the documented variables property there. It returns the (documented) weights property, and the weights property returns the (documented) self.trainable_weights + self.non_trainable_weights properties. And finally the root of the problem:

@property
def trainable_weights(self):
    return self._trainable_weights if self.trainable else []

@property
def non_trainable_weights(self):
    if self.trainable:
        return self._non_trainable_weights
    else:
        return self._trainable_weights + self._non_trainable_weights

That is, variables does not work for a DropoutWrapper instance. Neither do trainable_weights or non_trainable_weights, since self.trainable is never defined.

One step deeper, Layer.__init__ defaults self.trainable to True, but DropoutWrapper never calls it. To quote a TensorFlow contributor on Github,

DropoutWrapper does not have variables because it does not itself store any. It wraps a cell that may have variables; but it's not clear what the semantics should be if you access the DropoutWrapper.variables. For example, all keras layers only report back the variables that they own; and so only one layer ever owns any variable. That said, this should probably return [], and the reason it doesn't is that DropoutWrapper never calls super().__init__ in its constructor. That should be an easy fix; PRs welcome.
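
The mechanics are plain Python attribute lookup, nothing TensorFlow-specific. A stripped-down reproduction, with hypothetical class names purely for illustration:

class BaseLayer(object):
    def __init__(self):
        self.trainable = True  # only exists if this __init__ actually runs
        self._trainable_weights = []

    @property
    def trainable_weights(self):
        return self._trainable_weights if self.trainable else []

class WrapperLike(BaseLayer):
    def __init__(self, cell):
        # Mirrors DropoutWrapper: super().__init__() is never called,
        # so self.trainable is never set on the instance.
        self._cell = cell

WrapperLike(cell=None).trainable_weights  # AttributeError: ... no attribute 'trainable'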

So for instance, to access the LSTM variables in the above example, lstm_cell.variables suffices.
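
Applied to the question's code, that means keeping references to the unwrapped LSTMCell objects and attaching the summaries to those rather than to the DropoutWrappers. A sketch, using hypothetical names lstm1/lstm2/lstm3 for the bare cells kept around before wrapping:

# Hypothetical: hold on to the bare LSTMCells before wrapping them in DropoutWrapper.
bare_cells = {"cell_1": lstm1, "cell_2": lstm2, "cell_3": lstm3}

# Only after tf.nn.dynamic_rnn has built the graph do these variables exist.
for scope_name, bare_cell in bare_cells.items():
    kernel, bias = bare_cell.variables
    tf.summary.histogram(scope_name + "/kernel", kernel)
    tf.summary.histogram(scope_name + "/bias", bias)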


Edit: To the best of my knowledge, Mike Khan's PR has been incorporated into 1.5. Now, the variables property of the dropout layer returns an empty list.
