
How to use PyTorch to calculate the gradients of outputs w.r.t. the inputs in a neural network?

I have a trained network, and I want to calculate the gradients of its outputs w.r.t. the inputs. Going through the PyTorch docs, torch.autograd.grad looks useful, so I use the following code:

    x_test = torch.randn(D_in,requires_grad=True)
    y_test = model(x_test)
    d = torch.autograd.grad(y_test, x_test)[0]

model is the neural network, x_test is the input of size D_in, and y_test is a scalar output. I want to compare the computed gradient with a numerical difference from scipy.misc.derivative, so I calculated the partial derivative with respect to a single component, selected by an index:

    from scipy.misc import derivative
    import torch

    # model and D_in are defined earlier (the trained network and its input size)
    idx = 3
    x_test = torch.randn(D_in, requires_grad=True)
    y_test = model(x_test)
    print(x_test[idx].item())

    # gradient of the scalar output w.r.t. the whole input vector
    d = torch.autograd.grad(y_test, x_test)[0]
    print(d[idx].item())

    def fun(x):
        # evaluate the model with only the idx-th entry of the input changed
        x_input = x_test.detach().clone()  # clone so x_test itself is not modified in place
        x_input[idx] = x
        with torch.no_grad():
            y = model(x_input)
        return y.item()

    x0 = x_test[idx].item()
    print(x0)
    print(derivative(fun, x0, dx=1e-6))

But I got totally different results. The gradient calculated by torch.autograd.grad is -0.009522666223347187, while that by scipy.misc.derivative is -0.014901161193847656.

Is there anything wrong with the calculation, or am I using torch.autograd.grad incorrectly?

asked Aug 03 '18 06:08 by SungSingSong

People also ask

How are gradients calculated in PyTorch?

Gradients are calculated by tracing the graph from the root to the leaf and multiplying every gradient along the way using the chain rule.
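
For illustration (the function and values here are made up, not from the question), the chain rule that autograd applies can be checked by hand on a tiny composed function:

    import torch

    # tiny composed function: y = (w * x + b) ** 2
    x = torch.tensor(1.5, requires_grad=True)
    w, b = torch.tensor(2.0), torch.tensor(0.5)

    y = (w * x + b) ** 2
    y.backward()  # traces the graph from the root y back to the leaf x

    # chain rule by hand: dy/dx = 2 * (w * x + b) * w
    manual = 2 * (w * x + b) * w
    print(x.grad.item(), manual.item())  # both print 14.0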

What is a PyTorch gradient?

The gradient collects the derivatives of a function. In mathematical terms, this means partially differentiating the function with respect to each input and evaluating the result. The sketch below shows how to calculate the derivatives of a function.
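
A minimal sketch of that idea (the function and values are arbitrary, not from the question): compute the partial derivatives of f(x, y) = x² · y with autograd and compare them to the analytic results.

    import torch

    # f(x, y) = x**2 * y, so df/dx = 2*x*y and df/dy = x**2
    x = torch.tensor(3.0, requires_grad=True)
    y = torch.tensor(4.0, requires_grad=True)

    f = x ** 2 * y
    f.backward()

    print(x.grad.item())  # 2 * 3 * 4 = 24.0
    print(y.grad.item())  # 3 ** 2   =  9.0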

How does Autograd in PyTorch work?

Autograd is a reverse-mode automatic differentiation system. Conceptually, autograd records a graph of all the operations that created the data as you execute them, giving you a directed acyclic graph whose leaves are the input tensors and whose roots are the output tensors.
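
A small sketch (the tensor names are illustrative) of what that recorded graph looks like, with the inputs as leaves and the output as the root:

    import torch

    a = torch.randn(3, requires_grad=True)   # leaf (input tensor)
    b = torch.randn(3, requires_grad=True)   # leaf (input tensor)

    out = (a * b).sum()                      # root (output tensor)

    print(a.is_leaf, b.is_leaf)        # True True
    print(out.grad_fn)                 # e.g. <SumBackward0 object at 0x...>
    print(out.grad_fn.next_functions)  # edges pointing back towards the leaves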


1 Answer

In fact, it is very likely that your given code is completely correct. Let me explain this by giving you a little background information on backpropagation, or rather, in this case, Automatic Differentiation (AutoDiff).

The specific implementation of many packages is based on autograd, a common technique to get the exact derivatives of a function/graph. It does this by essentially "inverting" the forward computational pass to compute piece-wise derivatives of atomic function blocks, like addition, subtraction, multiplication, division, etc., and then "chaining them together".
I explained AutoDiff and its specifics in a more detailed answer in this question.
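
To make that concrete, here is a sketch (not the code from the question) in which a function built from atomic operations is differentiated by autograd, and the chained local derivatives are written out by hand; the two agree up to floating-point rounding:

    import torch

    x = torch.tensor(0.7, requires_grad=True)

    # built from atomic blocks: multiply, add, sin
    y = torch.sin(3.0 * x + 1.0)

    (grad,) = torch.autograd.grad(y, x)

    # local derivatives chained together by hand: cos(3*x + 1) * 3
    manual = torch.cos(torch.tensor(3.0 * 0.7 + 1.0)) * 3.0
    print(grad.item(), manual.item())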

By contrast, scipy's derivative function only approximates this derivative using finite differences: it evaluates the function at close-by points and estimates the derivative from the difference in function values at those points. This is why you see a slight difference between the two gradients, since a finite difference can be an inaccurate representation of the actual derivative.
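
To see this in code, here is a hedged sketch of the comparison; the two-layer model, D_in, and seed below are stand-ins, since the original model is not shown in the question. With double precision and a central difference, the finite-difference estimate lands close to (but not exactly on) the autograd gradient:

    import torch

    torch.manual_seed(0)
    D_in = 8
    # stand-in for the trained network from the question
    model = torch.nn.Sequential(
        torch.nn.Linear(D_in, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
    ).double()

    x = torch.randn(D_in, dtype=torch.float64, requires_grad=True)
    y = model(x)
    d = torch.autograd.grad(y, x)[0]   # exact (reverse-mode AutoDiff)

    idx, dx = 3, 1e-6

    def f(v):
        x_in = x.detach().clone()
        x_in[idx] = v
        with torch.no_grad():
            return model(x_in).item()

    x0 = x[idx].item()
    central = (f(x0 + dx) - f(x0 - dx)) / (2 * dx)  # finite-difference estimate
    print(d[idx].item(), central)      # close, but not bit-identical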

answered Oct 21 '22 12:10 by dennlinger