 

TensorFlow 2.0: how to group graph using tf.keras? tf.name_scope/tf.variable_scope not used anymore?


Back in TensorFlow < 2.0 we used to define layers, especially more complex setups such as inception modules, by grouping them with tf.name_scope or tf.variable_scope.

Using these context managers we could conveniently structure the compute graph, which makes TensorBoard's graph view much easier to interpret.

Just one example of the structured groups: (TensorBoard screenshot of grouped layers)

This comes in very handy for debugging complex architectures.

Unfortunately, tf.keras seems to ignore tf.name_scope, and tf.variable_scope is gone in TensorFlow >= 2.0. Thus, a solution like this...

with tf.variable_scope("foo"):
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
        assert v.name == "foo/bar/v:0"

...is not available anymore. Is there any replacement?

How can we group layers and entire models in TensorFlow >= 2.0? If we do not group layers, tf.keras creates a big mess for complex models by just placing everything serially in the graph view.

Is there a replacement for tf.variable_scope? I could not find one so far, but I made heavy use of this method in TensorFlow < 2.0.


EDIT: I have now implemented an example for TensorFlow 2.0. This is a simple GAN implemented using tf.keras:

import numpy as np
import tensorflow as tf
from tensorflow import keras as tk

# Generator
G_inputs = tk.Input(shape=(100,), name="G_inputs")

x = tk.layers.Dense(7 * 7 * 16)(G_inputs)
x = tf.nn.leaky_relu(x)
x = tk.layers.Flatten()(x)
x = tk.layers.Reshape((7, 7, 16))(x)

x = tk.layers.Conv2DTranspose(32, (3, 3), padding="same")(x)
x = tk.layers.BatchNormalization()(x)
x = tf.nn.leaky_relu(x)
x = tf.image.resize(x, (14, 14))

x = tk.layers.Conv2DTranspose(32, (3, 3), padding="same")(x)
x = tk.layers.BatchNormalization()(x)
x = tf.nn.leaky_relu(x)
x = tf.image.resize(x, (28, 28))

x = tk.layers.Conv2DTranspose(32, (3, 3), padding="same")(x)
x = tk.layers.BatchNormalization()(x)
x = tf.nn.leaky_relu(x)

x = tk.layers.Conv2DTranspose(1, (3, 3), padding="same")(x)
x = tf.nn.sigmoid(x)

G_model = tk.Model(inputs=G_inputs,
                   outputs=x,
                   name="G")
G_model.summary()

# Discriminator
D_inputs = tk.Input(shape=(28, 28, 1), name="D_inputs")

x = tk.layers.Conv2D(32, (3, 3), padding="same")(D_inputs)
x = tf.nn.leaky_relu(x)
x = tk.layers.MaxPooling2D((2, 2))(x)
x = tk.layers.Conv2D(32, (3, 3), padding="same")(x)
x = tf.nn.leaky_relu(x)
x = tk.layers.MaxPooling2D((2, 2))(x)
x = tk.layers.Conv2D(64, (3, 3), padding="same")(x)
x = tf.nn.leaky_relu(x)

x = tk.layers.Flatten()(x)

x = tk.layers.Dense(128)(x)
x = tf.nn.sigmoid(x)
x = tk.layers.Dense(64)(x)
x = tf.nn.sigmoid(x)
x = tk.layers.Dense(1)(x)
x = tf.nn.sigmoid(x)

D_model = tk.Model(inputs=D_inputs,
                   outputs=x,
                   name="D")

D_model.compile(optimizer=tk.optimizers.Adam(learning_rate=1e-5, beta_1=0.5, name="Adam_D"),
                loss="binary_crossentropy")
D_model.summary()

GAN = tk.Sequential()
GAN.add(G_model)
GAN.add(D_model)
GAN.compile(optimizer=tk.optimizers.Adam(learning_rate=1e-5, beta_1=0.5, name="Adam_GAN"),
            loss="binary_crossentropy")

tb = tk.callbacks.TensorBoard(log_dir="./tb_tf2.0", write_graph=True)

# dummy data
noise = np.random.rand(100, 100).astype(np.float32)
target = np.ones(shape=(100, 1), dtype=np.float32)

GAN.fit(x=noise,
        y=target,
        callbacks=[tb])

The resulting TensorBoard graph of these models looks like this: the layers are a complete mess, and the models "G" and "D" (on the right side) also hide some mess inside. "GAN" is completely missing, and the training operation "Adam" cannot be opened properly: too many layers are simply plotted from left to right, with arrows all over the place. It is very hard to verify the correctness of your GAN this way.


Although a TensorFlow 1.x implementation of the same GAN involves lots of boilerplate code...

import numpy as np
import tensorflow as tf

# Generator
Z = tf.placeholder(tf.float32, shape=[None, 100], name="Z")


def model_G(inputs, reuse=False):
    with tf.variable_scope("G", reuse=reuse):
        x = tf.layers.dense(inputs, 7 * 7 * 16)
        x = tf.nn.leaky_relu(x)
        x = tf.reshape(x, (-1, 7, 7, 16))

        x = tf.layers.conv2d_transpose(x, 32, (3, 3), padding="same")
        x = tf.layers.batch_normalization(x)
        x = tf.nn.leaky_relu(x)
        x = tf.image.resize_images(x, (14, 14))

        x = tf.layers.conv2d_transpose(x, 32, (3, 3), padding="same")
        x = tf.layers.batch_normalization(x)
        x = tf.nn.leaky_relu(x)
        x = tf.image.resize_images(x, (28, 28))

        x = tf.layers.conv2d_transpose(x, 32, (3, 3), padding="same")
        x = tf.layers.batch_normalization(x)
        x = tf.nn.leaky_relu(x)

        x = tf.layers.conv2d_transpose(x, 1, (3, 3), padding="same")
        G_logits = x
        G_out = tf.nn.sigmoid(x)

    return G_logits, G_out


# Discriminator
D_in = tf.placeholder(tf.float32, shape=[None, 28, 28, 1], name="D_in")


def model_D(inputs, reuse=False):
    with tf.variable_scope("D", reuse=reuse):
        with tf.variable_scope("conv"):
            x = tf.layers.conv2d(inputs, 32, (3, 3), padding="same")
            x = tf.nn.leaky_relu(x)
            x = tf.layers.max_pooling2d(x, (2, 2), (2, 2))
            x = tf.layers.conv2d(x, 32, (3, 3), padding="same")
            x = tf.nn.leaky_relu(x)
            x = tf.layers.max_pooling2d(x, (2, 2), (2, 2))
            x = tf.layers.conv2d(x, 64, (3, 3), padding="same")
            x = tf.nn.leaky_relu(x)

        with tf.variable_scope("dense"):
            x = tf.reshape(x, (-1, 7 * 7 * 64))

            x = tf.layers.dense(x, 128)
            x = tf.nn.sigmoid(x)
            x = tf.layers.dense(x, 64)
            x = tf.nn.sigmoid(x)
            x = tf.layers.dense(x, 1)
            D_logits = x
            D_out = tf.nn.sigmoid(x)

    return D_logits, D_out

# models
G_logits, G_out = model_G(Z)
D_logits, D_out = model_D(D_in)
GAN_logits, GAN_out = model_D(G_out, reuse=True)

# losses
target = tf.placeholder(tf.float32, shape=[None, 1], name="target")
d_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D_logits, labels=target))
gan_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=GAN_logits, labels=target))

# train ops
train_d = tf.train.AdamOptimizer(learning_rate=1e-5, name="AdamD") \
    .minimize(d_loss, var_list=tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope="D"))
train_gan = tf.train.AdamOptimizer(learning_rate=1e-5, name="AdamGAN") \
    .minimize(gan_loss, var_list=tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope="G"))

# dummy data
dat_noise = np.random.rand(100, 100).astype(np.float32)
dat_target = np.ones(shape=(100, 1), dtype=np.float32)

sess = tf.Session()
tf_init = tf.global_variables_initializer()
sess.run(tf_init)

# merged = tf.summary.merge_all()
writer = tf.summary.FileWriter("./tb_tf1.0", sess.graph)

ret = sess.run([gan_loss, train_gan], feed_dict={Z: dat_noise, target: dat_target})

...the resulting TensorBoard graph looks considerably cleaner. Notice how clean the "AdamD" and "AdamGAN" scopes are at the top right. You can directly check that your optimizers are attached to the right scopes / gradients.

asked Mar 23 '19 by daniel451

1 Answer

According to the community RFC Variables in TensorFlow 2.0:

  • to control variable naming users can use tf.name_scope + tf.Variable

Indeed, tf.name_scope still exists in TensorFlow 2.0, so you can just do:

with tf.name_scope("foo"):
    with tf.name_scope("bar"):
        v = tf.Variable([0], dtype=tf.float32, name="v")
        assert v.name == "foo/bar/v:0"

Also, as the point above that one states:

  • the tf 1.0 version of variable_scope and get_variable will be left in tf.compat.v1

So you can just fall back to tf.compat.v1.variable_scope and tf.compat.v1.get_variable if you really need to.
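
For example, a minimal sketch of that fallback could look like the following; building inside an explicit tf.Graph is my own assumption here, since tf.compat.v1.get_variable targets the TF1-style graph workflow:

import tensorflow as tf

# Illustrative sketch only: use the TF1-style scoping API via tf.compat.v1,
# inside an explicit graph so the variable store behaves as in TF 1.x.
graph = tf.Graph()
with graph.as_default():
    with tf.compat.v1.variable_scope("foo"):
        with tf.compat.v1.variable_scope("bar"):
            v = tf.compat.v1.get_variable("v", [1])
            assert v.name == "foo/bar/v:0"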

Variable scopes and tf.get_variable can be convenient but are riddled with minor pitfalls and corner cases, especially since they behave similarly to, but not exactly like, name scopes, while actually being a parallel mechanism to them. I think having just name scopes will be more consistent and straightforward.
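
As a side note that goes beyond the original answer: if the goal is specifically to get grouped nodes in TensorBoard's graph view in TF 2.x, one option is to create the ops under tf.name_scope inside a tf.function and export the traced graph. The log directory, shapes and ops below are made up purely for illustration:

import tensorflow as tf

writer = tf.summary.create_file_writer("./tb_scopes")  # hypothetical log dir

@tf.function
def gan_step(z):
    # Ops created under a name scope get a "G/..." or "D/..." prefix in the
    # traced graph, so TensorBoard can collapse each scope into a single node.
    with tf.name_scope("G"):
        h = tf.nn.leaky_relu(tf.matmul(z, tf.ones((100, 64))))
    with tf.name_scope("D"):
        out = tf.nn.sigmoid(tf.matmul(h, tf.ones((64, 1))))
    return out

tf.summary.trace_on(graph=True)
gan_step(tf.random.normal((8, 100)))
with writer.as_default():
    tf.summary.trace_export(name="scoped_graph", step=0)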

answered Sep 29 '22 by jdehesa