
Why is AdamOptimizer duplicated in my graph?

I am fairly new to the internals of TensorFlow. While trying to understand TensorFlow's implementation of AdamOptimizer, I checked the corresponding subgraph in TensorBoard. There seems to be a duplicate subgraph named name + '_1', where name='Adam' by default.

The following MWE produces the graph below. (Note that I have expanded the x node!)

import tensorflow as tf

tf.reset_default_graph()

# A single scalar variable; minimize(x) treats x itself as the loss.
x = tf.Variable(1.0, name='x')
train_step = tf.train.AdamOptimizer(1e-1, name='MyAdam').minimize(x)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    # Dump the graph so it can be inspected in TensorBoard.
    with tf.summary.FileWriter('./logs/mwe') as writer:
        writer.add_graph(sess.graph)

[Image: the resulting TensorBoard graph, with the x node expanded to show the MyAdam and MyAdam_1 subscopes]

I am confused because I would expect the above code to produce just a single namespace inside the graph. Even after examining the relevant source files (namely adam.py, optimizer.py and training_ops.cc), it's not clear to me how/why/where the duplicate is created.

Question: What is the source of the duplicate AdamOptimizer subgraph?

I can think of the following possibilities:

  • A bug in my code
  • Some sort of artifact generated in TensorBoard
  • This is expected behavior (if so, then why?)
  • A bug in TensorFlow

Edit: Cleanup and clarification

Due to some initial confusion, I cluttered my original question with detailed instructions for setting up a reproducible TensorFlow/TensorBoard environment that produces this graph. I have now replaced all of that with the clarification above about expanding the x node.

asked Mar 15 '19 by Ben Mares

1 Answer

This is not a bug, just a perhaps questionable design choice: the optimizer leaks outside of its own scope.

First, not a bug: The Adam optimizer is not duplicated. As can be seen in your graph, there is a single /MyAdam scope, not two. No problem here.

However, two subscopes named MyAdam and MyAdam_1 are added inside your variable's scope. They correspond respectively to the m and v variables (and their initialization operations) that the Adam optimizer maintains for this variable, i.e. its first and second moment estimates.

This is where the choices made by the optimizer are debatable. You could indeed reasonably expect the Adam optimizer's operations and variables to be defined strictly within its assigned scope. Instead, it creeps into the scopes of the optimized variables to place its statistics variables there.

So, a debatable choice to say the least, but not a bug, in the sense that the Adam optimizer is indeed not duplicated.
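
You can verify this without TensorBoard by querying the optimizer's slot variables directly. Here is a minimal sketch using the TF 1.x slot API (the printed names are what I would expect given the MWE above and its default scoping):

import tensorflow as tf

tf.reset_default_graph()
x = tf.Variable(1.0, name='x')
opt = tf.train.AdamOptimizer(1e-1, name='MyAdam')
train_step = opt.minimize(x)

# Adam keeps two slots per optimized variable: the moment estimates m and v.
print(opt.get_slot_names())        # ['m', 'v']

# The slot variables are scoped under the variable, not under the optimizer:
print(opt.get_slot(x, 'm').name)   # expected: 'x/MyAdam:0'
print(opt.get_slot(x, 'v').name)   # expected: 'x/MyAdam_1:0'

So MyAdam and MyAdam_1 inside x are the two slot variables, while the optimizer's update operations live in the single top-level MyAdam scope.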

EDIT

Note that this way of locating variables is common across optimizers -- you can observe the same effect with a MomentumOptimizer, for example (see the sketch below). Indeed, this is the standard way of creating slots for optimizers -- see here:

# Scope the slot name in the namespace of the primary variable.
# Set "primary.op.name + '/' + name" as default name, so the scope name of
# optimizer can be shared when reuse is True. Meanwhile when reuse is False
# and the same name has been previously used, the scope name will add '_N'
# as suffix for unique identifications.

So as I understand it, they chose to locate the statistics of a variable within a subscope of the variable's own scope, so that if the variable is shared/reused, its statistics are also shared/reused and do not need to be recomputed. This is indeed a reasonable thing to do, even if, again, creeping outside of the optimizer's scope is somewhat unsettling.
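
For example, here is the same kind of check with a MomentumOptimizer (again a minimal sketch; the printed name is what I would expect under the default scoping described above):

import tensorflow as tf

tf.reset_default_graph()
x = tf.Variable(1.0, name='x')
opt = tf.train.MomentumOptimizer(1e-1, momentum=0.9, name='MyMomentum')
train_step = opt.minimize(x)

# Momentum keeps a single 'momentum' slot per variable, again scoped
# under the variable rather than under the optimizer:
print(opt.get_slot_names())               # ['momentum']
print(opt.get_slot(x, 'momentum').name)   # expected: 'x/MyMomentum:0'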

answered Sep 19 '22 by P-Gn