TensorFlow 2.0 doesn't compute the gradient

I want to visualize the patterns that a given feature map in a CNN has learned (in this example I'm using VGG16). To do so, I create a random image, feed it through the network up to the desired convolutional layer, pick a feature map, and compute the gradient of its mean activation with respect to the input. The idea is to change the input in a way that maximizes the activation of that feature map. In TensorFlow 2.0 I record the forward pass with a GradientTape and then ask it for the gradient, but the gradient comes back as None. Why is it unable to compute the gradient?

import tensorflow as tf
import matplotlib.pyplot as plt
import time
import numpy as np
from tensorflow.keras.applications import vgg16

class maxFeatureMap():

    def __init__(self, model):

        self.model = model
        self.optimizer = tf.keras.optimizers.Adam()

    def getNumLayers(self, layer_name):

        for layer in self.model.layers:
            if layer.name == layer_name:
                weights = layer.get_weights()
                num = weights[1].shape[0]
        return ("There are {} feature maps in {}".format(num, layer_name))

    def getGradient(self, layer, feature_map):

        pic = vgg16.preprocess_input(np.random.uniform(size=(1,96,96,3))) ## Creates values between 0 and 1
        pic = tf.convert_to_tensor(pic)

        model = tf.keras.Model(inputs=self.model.inputs, 
                               outputs=self.model.layers[layer].output)
        with tf.GradientTape() as tape:
            ## predicts the output of the model and only chooses the feature_map indicated
            predictions = model.predict(pic, steps=1)[0][:,:,feature_map]
            loss = tf.reduce_mean(predictions)
        print(loss)
        gradients = tape.gradient(loss, pic[0])
        print(gradients)
        self.optimizer.apply_gradients(zip(gradients, pic))

model = vgg16.VGG16(weights='imagenet', include_top=False)


x = maxFeatureMap(model)
x.getGradient(1, 24)

asked Jul 06 '19 17:07

Will

2 Answers

This is a common pitfall with GradientTape: the tape only traces tensors that are being "watched", and by default it watches only trainable variables (that is, tf.Variable objects created with trainable=True). To watch the pic tensor, add tape.watch(pic) as the very first line inside the tape context.
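As a minimal sketch of the watching behavior described above (using a small constant tensor x in place of the image, which is an assumption made purely for illustration), the same computation can be run once without and once with tape.watch:

```python
import tensorflow as tf

# A plain tensor, not a tf.Variable, so the tape does not watch it by default.
x = tf.constant([1.0, 2.0, 3.0])

# Without tape.watch: operations on x are not recorded.
with tf.GradientTape() as tape:
    loss = tf.reduce_sum(tf.square(x))
unwatched_grad = tape.gradient(loss, x)

# With tape.watch: operations on x are recorded, so d(loss)/dx = 2 * x.
with tf.GradientTape() as tape:
    tape.watch(x)
    loss = tf.reduce_sum(tf.square(x))
watched_grad = tape.gradient(loss, x)

print(unwatched_grad)        # None
print(watched_grad.numpy())  # [2. 4. 6.]
```

The only difference between the two blocks is the tape.watch(x) call, which is the change suggested for pic in the question.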

Also, pass pic itself to tape.gradient rather than pic[0]: the slice is a new tensor that the loss was never computed from, so the tape has no path from the loss back to it. Since pic has just one entry in the first dimension, nothing is lost by dropping the index.

Furthermore, you cannot use model.predict, because it returns a NumPy array. That breaks the computation graph, so gradients cannot be backpropagated through it. Simply use the model as a callable, i.e. predictions = model(pic).
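Putting the three points together, the gradient step might look roughly like the sketch below. To keep it self-contained it uses a single randomly initialized Conv2D layer as a hypothetical stand-in for the truncated VGG16 model, and the channel index and step size are arbitrary illustrative values, not taken from the question.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the truncated VGG16 model: one conv layer.
inputs = tf.keras.Input(shape=(96, 96, 3))
outputs = tf.keras.layers.Conv2D(8, 3, activation="relu")(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

feature_map = 2  # illustrative channel index
pic = tf.convert_to_tensor(
    np.random.uniform(size=(1, 96, 96, 3)), dtype=tf.float32
)

with tf.GradientTape() as tape:
    tape.watch(pic)             # pic is a plain tensor, so watch it explicitly
    activations = model(pic)    # call the model directly, not model.predict
    loss = tf.reduce_mean(activations[0, :, :, feature_map])

# Differentiate with respect to pic itself, not pic[0].
gradients = tape.gradient(loss, pic)

# One plain gradient-ascent step (the step size 0.1 is arbitrary).
updated_pic = pic + 0.1 * gradients
print(gradients.shape)
```

The sketch applies the update by hand. To keep using an optimizer such as Adam, the image would likely need to be a tf.Variable, since apply_gradients works on (gradient, variable) pairs.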

answered Oct 10 '22 00:10

xdurch0

Did you define your own loss function? Did you convert a tensor to a NumPy array inside it?

As a beginner, I ran into the same problem: tape.gradient(loss, variables) returned None because I converted a tensor to a NumPy array inside my own loss function. It seems to be a silly but common mistake for beginners.
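The mistake described above can be reproduced with a toy example. The variable w, the target, and the two loss functions below are hypothetical names invented for illustration; they only show how a detour through NumPy cuts the link between the loss and the variable.

```python
import numpy as np
import tensorflow as tf

w = tf.Variable([1.0, 2.0])
target = tf.constant([0.0, 0.0])

def numpy_loss(pred, target):
    # Leaves TensorFlow: the result is a NumPy value with no link back to w.
    return np.mean((pred.numpy() - target.numpy()) ** 2)

def tensor_loss(pred, target):
    # Stays in TensorFlow ops, so the tape can trace it back to w.
    return tf.reduce_mean(tf.square(pred - target))

with tf.GradientTape() as tape:
    loss = tf.convert_to_tensor(numpy_loss(w * 3.0, target))
broken_grad = tape.gradient(loss, w)

with tf.GradientTape() as tape:
    loss = tensor_loss(w * 3.0, target)
working_grad = tape.gradient(loss, w)

print(broken_grad)           # None
print(working_grad.numpy())  # [ 9. 18.]
```

With the TensorFlow version of the loss, the gradient is 9 * w, which gives [9, 18] for the starting values used here.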

answered Oct 10 '22 00:10

LuTan

Tags: python, tensorflow, gradient-descent, conv-neural-network
