I have a network made with InceptionNet, and for an input sample bx I want to compute the gradients of the model output w.r.t. a hidden layer's output. I have the following code:
bx = tf.reshape(x_batch[0, :, :, :], (1, 299, 299, 3))
with tf.GradientTape() as gtape:
    #gtape.watch(x)
    preds = model(bx)
    print(preds.shape, end='  ')
    class_idx = np.argmax(preds[0])
    print(class_idx, end='   ')
    class_output = model.output[:, class_idx]
    print(class_output, end='  ')
    last_conv_layer = model.get_layer('inception_v3').get_layer('mixed10')
    #gtape.watch(last_conv_layer)
    print(last_conv_layer)

grads = gtape.gradient(class_output, last_conv_layer.output)#[0]
print(grads)
But this will give None. I tried gtape.watch(bx) as well, but it still gives None.
Before trying GradientTape, I tried using tf.keras.backend.gradients, but that gave the following error:
RuntimeError: tf.gradients is not supported when eager execution is enabled. Use tf.GradientTape instead.
My model is as follows:
model.summary()
Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
inception_v3 (Model)         (None, 1000)              23851784
_________________________________________________________________
dense_5 (Dense)              (None, 2)                 2002
=================================================================
Total params: 23,853,786
Trainable params: 23,819,354
Non-trainable params: 34,432
_________________________________________________________________
Any solution is appreciated. It doesn't have to be GradientTape, if there is any other way to compute these gradients.
TensorFlow "records" relevant operations executed inside the context of a tf. GradientTape onto a "tape". TensorFlow then uses that tape to compute the gradients of a "recorded" computation using reverse mode differentiation.
Eager execution is a powerful execution environment that evaluates operations immediately. It does not build graphs, and the operations return actual values instead of computational graphs to run later. With Eager execution, TensorFlow calculates the values of tensors as they occur in your code.
Behind the scenes, TensorFlow is a tensor library with automatic differentiation capability. Hence you can easily use it to solve a numerical optimization problem with gradient descent. In this post, you will learn how TensorFlow's automatic differentiation engine, autograd, works.
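For example, here is a minimal, self-contained sketch of that recording mechanism (a toy computation, not the model above):

import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape() as tape:
    # Constants are not watched automatically, so watch `x` explicitly.
    tape.watch(x)
    y = x * x  # this operation is recorded on the tape
# Reverse-mode differentiation over the recorded ops: dy/dx = 2x = 6.0
dy_dx = tape.gradient(y, x)
print(dy_dx)  # tf.Tensor(6.0, shape=(), dtype=float32)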
I had the same problem as you. I'm not sure if this is the cleanest way to solve the problem, but here's my solution.
I think the problem is that you need to pass the actual return value of last_conv_layer.call(...) as an argument to tape.watch(). Since all layers are called sequentially within the scope of the model(bx) call, you'll have to somehow inject some code into this inner scope. I did this using the following decorator:
def watch_layer(layer, tape):
    """
    Make an intermediate hidden `layer` watchable by the `tape`.
    After calling this function, you can obtain the gradient with
    respect to the output of the `layer` by calling:

        grads = tape.gradient(..., layer.result)
    """
    def decorator(func):
        def wrapper(*args, **kwargs):
            # Store the result of `layer.call` internally.
            layer.result = func(*args, **kwargs)
            # From this point onwards, watch this tensor.
            tape.watch(layer.result)
            # Return the result to continue with the forward pass.
            return layer.result
        return wrapper

    layer.call = decorator(layer.call)
    return layer
In your example, I believe the following should then work for you:
bx = tf.reshape(x_batch[0, :, :, :], (1, 299, 299, 3))
last_conv_layer = model.get_layer('inception_v3').get_layer('mixed10')
with tf.GradientTape() as gtape:
    # Make the `last_conv_layer` watchable
    watch_layer(last_conv_layer, gtape)
    preds = model(bx)
    class_idx = np.argmax(preds[0])
    # Use the eager `preds` tensor here; the symbolic `model.output` is not on the tape
    class_output = preds[:, class_idx]
# Get the gradient w.r.t. the output of `last_conv_layer`
grads = gtape.gradient(class_output, last_conv_layer.result)
print(grads)
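Note that watch_layer replaces layer.call with the wrapped version permanently, so if you intend to run the model again without recording (or with a fresh tape), keep a reference to the original layer.call and restore it once the gradients have been computed.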
You can use the tape to compute the gradient of an output node with respect to a set of watchable objects. By default, trainable variables are watchable by the tape, and you can access the trainable variables of a specific layer by getting it by name and reading its trainable_variables property.
E.g. in the code below, I compute the gradients of the prediction only with respect to the variables of the first FC layer (named "fc1"), treating every other variable as a constant.
import tensorflow as tf

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Dense(10, input_shape=(3,), name="fc1", activation="relu"),
        tf.keras.layers.Dense(3, input_shape=(3,), name="fc2"),
    ]
)

# A batch of one sample matching the declared input_shape=(3,)
inputs = tf.ones((1, 3))

with tf.GradientTape() as tape:
    preds = model(inputs)

grads = tape.gradient(preds, model.get_layer("fc1").trainable_variables)
print(grads)
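Since the question also allows approaches other than watching the layer, another option is to build a second model that exposes the hidden feature map as an explicit output and differentiate against that output. The following is only a sketch: it assumes the inner 'inception_v3' sub-model is a functional Keras model (so its input/output tensors are accessible), reuses the layer names from the summary above ('mixed10', 'dense_5'), and takes bx to be the preprocessed input batch from the question:

import numpy as np
import tensorflow as tf

inception = model.get_layer('inception_v3')
last_conv_layer = inception.get_layer('mixed10')

# Second model that returns both the hidden feature map and the
# InceptionV3 predictions for the same input.
grad_model = tf.keras.models.Model(
    inputs=inception.input,
    outputs=[last_conv_layer.output, inception.output],
)

with tf.GradientTape() as tape:
    conv_output, inception_preds = grad_model(bx)
    # Re-apply the outer Dense layer from the Sequential model.
    preds = model.get_layer('dense_5')(inception_preds)
    class_idx = np.argmax(preds[0])
    class_output = preds[:, class_idx]

# Gradient of the selected class score w.r.t. the hidden feature map;
# for 299x299 inputs, mixed10 typically has shape (1, 8, 8, 2048).
grads = tape.gradient(class_output, conv_output)
print(grads.shape)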