In the DCGAN example from the TensorFlow 2.0 guide, there are two gradient tapes. See below.
@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))
As you can clearly see, there are two gradient tapes. I was wondering what difference using a single tape would make, so I changed it to the following:
@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    with tf.GradientTape() as tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))
This gives me the following error:
RuntimeError: GradientTape.gradient can only be called once on non-persistent tapes.
I would like to know why two tapes are necessary. As of now, the documentation on the TF 2.0 APIs is scanty. Can anyone explain or point me to the right docs/tutorials?
From the documentation of GradientTape:
By default, the resources held by a GradientTape are released as soon as GradientTape.gradient() method is called. To compute multiple gradients over the same computation, create a persistent gradient tape. This allows multiple calls to the gradient() method as resources are released when the tape object is garbage collected.
A persistent gradient tape can be created with with tf.GradientTape(persistent=True) as tape, and can/should be manually deleted afterwards with del tape (credits for this to @zwep and @Crispy13).
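For instance, the single-tape version from the question would work if the tape is made persistent. This is only a sketch of a modified train_step, reusing the names defined above, not the official tutorial code:

@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    # persistent=True lets us call tape.gradient() more than once
    with tf.GradientTape(persistent=True) as tape:
        generated_images = generator(noise, training=True)
        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)
        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    # two gradient() calls on the same tape are now allowed
    gradients_of_generator = tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

    # release the resources held by the persistent tape
    del tape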
The technical reason is that gradient() is called twice, which is not allowed on (non-persistent) tapes.
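The restriction is easy to reproduce outside the GAN code. A minimal, self-contained sketch with made-up values:

import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape() as tape:  # non-persistent by default
    tape.watch(x)                # constants are not watched automatically
    y = x * x
    z = y * y

dy_dx = tape.gradient(y, x)  # first call works; tape resources are released here
dz_dx = tape.gradient(z, x)  # second call raises the RuntimeError above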
In the present case, however, the underlying reason is that training GANs is typically done by alternating the optimization of the generator and the discriminator. Each optimization has its own optimizer, the two optimizers operate on different variables, and even the loss that is minimized is different (gen_loss and disc_loss in your code).
So you end up with two tapes because training a GAN essentially means optimizing two different (adversarial) problems in an alternating fashion.
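To make that alternating structure explicit, one common pattern is to give each sub-problem its own training step. The split below is a hypothetical sketch, not the tutorial's code, reusing the models, losses, and optimizers from the question:

@tf.function
def discriminator_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])
    with tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)
        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)
        disc_loss = discriminator_loss(real_output, fake_output)
    # only the discriminator's variables are updated here
    grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    discriminator_optimizer.apply_gradients(zip(grads, discriminator.trainable_variables))

@tf.function
def generator_step():
    noise = tf.random.normal([BATCH_SIZE, noise_dim])
    with tf.GradientTape() as gen_tape:
        generated_images = generator(noise, training=True)
        fake_output = discriminator(generated_images, training=True)
        gen_loss = generator_loss(fake_output)
    # only the generator's variables are updated here
    grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    generator_optimizer.apply_gradients(zip(grads, generator.trainable_variables))

Each step builds its own tape over only the computation it needs, which is exactly what the two tapes in the tutorial's single train_step achieve.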