
How do I get the gradient of a keras model with respect to its inputs?

I recently asked a similar question about custom models (How do I find the derivative of a custom model in Keras?), but quickly realised I was trying to run before I could walk, so that question has been marked as a duplicate of this one.

I've tried to simplify my scenario and now have a (not custom) keras model consisting of 2 Dense layers:

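import tensorflow as tf

inputs = tf.keras.Input((cols,), name='input')

# The input shape is already fixed by tf.keras.Input above,
# so the redundant input_dim argument is dropped here.
layer_1 = tf.keras.layers.Dense(
        10,
        name='layer_1',
        use_bias=True,
        kernel_initializer=tf.constant_initializer(0.5),
        bias_initializer=tf.constant_initializer(0.1))(inputs)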

outputs = tf.keras.layers.Dense(
        1,
        name='alpha',
        use_bias=True,
        kernel_initializer=tf.constant_initializer(0.1),
        bias_initializer=tf.constant_initializer(0))(layer_1)

model = tf.keras.Model(inputs=inputs, outputs=outputs)

prediction = model.predict(input_data)
# gradients = ...

Now I would like to know the derivative of outputs with respect to inputs for inputs = input_data.

What I've tried so far:

This answer to a different question suggests running grads = K.gradients(model.output, model.input). However, when I run that I get this error:

tf.gradients is not supported when eager execution is enabled. Use tf.GradientTape instead.

I can only assume this is something to do with eager execution now being the default.

Another approach was in the answer to my question on custom keras models, which involved adding this:

with tf.GradientTape() as tape:
    x = tf.Variable(np.random.normal(size=(10, rows, cols)), dtype=tf.float32)
    out = model(x)

What I don't understand about this approach is how I'm supposed to load my data. It requires x to be a variable, but my x is a tf.keras.Input object. I also don't understand what the with statement is doing; it seems to be some kind of magic.

There's a very similar-sounding question here: Get Gradients with Keras Tensorflow 2.0, although the application and scenario are different enough that I have difficulty applying its answer to mine. It did lead me to add the following to my code:

with tf.GradientTape() as t:
    t.watch(outputs)

That does work, but now what? I run model.predict(...), but then how do I get my gradients? The answer says I should run t.gradient(outputs, x_tensor).numpy(), but what do I put in for x_tensor? I don't have an input variable. I tried running t.gradient(outputs, model.inputs) after running predict, but that resulted in this:

[screenshot of the result; image not available]

asked Jan 04 '20 12:01 by quant



1 Answer

I ended up getting this to work with a variant of the answer to this question: Get Gradients with Keras Tensorflow 2.0

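# Convert the input data to a tensor so the tape can track it.
x_tensor = tf.convert_to_tensor(input_data, dtype=tf.float32)
with tf.GradientTape() as t:
    t.watch(x_tensor)         # plain tensors, unlike variables, must be watched explicitly
    output = model(x_tensor)  # forward pass, recorded on the tape

result = output
# Gradient of the (summed) output with respect to the input; same shape as x_tensor.
gradients = t.gradient(output, x_tensor)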

This allows me to obtain both the output and the gradient without redundant computation.
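
As a sanity check, here is a minimal self-contained sketch of the same approach applied to the two-layer model from the question. The values of cols and input_data are made up for illustration, and tf.keras.initializers.Constant is used in place of tf.constant_initializer (an assumption that the two behave the same here). Because both Dense layers are linear with constant weights, the gradient can be worked out by hand: each output equals 10 * 0.1 * (0.5 * sum(x) + 0.1), so the derivative with respect to every input feature should be 0.5. The last few lines also show what happens if t.watch is left out.

```python
import numpy as np
import tensorflow as tf

cols = 4  # assumed number of input features (not given in the question)
# Assumed sample data: 3 rows of random inputs.
input_data = np.random.normal(size=(3, cols)).astype(np.float32)

# Rebuild the model from the question with the same constant weights.
inputs = tf.keras.Input((cols,), name='input')
layer_1 = tf.keras.layers.Dense(
    10,
    name='layer_1',
    kernel_initializer=tf.keras.initializers.Constant(0.5),
    bias_initializer=tf.keras.initializers.Constant(0.1))(inputs)
outputs = tf.keras.layers.Dense(
    1,
    name='alpha',
    kernel_initializer=tf.keras.initializers.Constant(0.1),
    bias_initializer=tf.keras.initializers.Constant(0.0))(layer_1)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

x_tensor = tf.convert_to_tensor(input_data, dtype=tf.float32)

# Forward pass inside the tape, with the input explicitly watched.
with tf.GradientTape() as t:
    t.watch(x_tensor)
    output = model(x_tensor)
gradients = t.gradient(output, x_tensor)

# Hand-derived gradient: 0.5 for every feature of every row.
expected = np.full_like(input_data, 0.5)
matches_analytic = np.allclose(gradients.numpy(), expected, atol=1e-5)
print("matches analytic gradient:", matches_analytic)

# Without watch(), a plain tensor is not tracked, so the gradient is None.
with tf.GradientTape() as t2:
    unwatched_output = model(x_tensor)
unwatched_gradients = t2.gradient(unwatched_output, x_tensor)
print("gradient without watch:", unwatched_gradients)
```

Note that model.predict is not used here: the forward pass has to happen by calling model(x_tensor) inside the with block, otherwise the tape has nothing recorded to differentiate.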

answered Sep 20 '22 15:09 by quant