How does PyTorch compute the gradients for a simple linear regression model?

Question

I am using PyTorch and trying to understand how a simple linear regression model works.

I'm using a simple LinearRegressionModel class:

import torch
import torch.nn as nn

class LinearRegressionModel(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(LinearRegressionModel, self).__init__()
        self.linear = nn.Linear(input_dim, output_dim)  

    def forward(self, x):
        out = self.linear(x)
        return out

model = LinearRegressionModel(1, 1)

Next, I instantiate a loss criterion and an optimizer:

criterion = nn.MSELoss()

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

Finally, to train the model I use the following code:

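for epoch in range(epochs):
    # Convert the NumPy arrays to tensors (the Variable wrapper is unnecessary since PyTorch 0.4)
    inputs = torch.from_numpy(x_train)
    labels = torch.from_numpy(y_train)

    # Move the data to the GPU only if the model's parameters live there
    if next(model.parameters()).is_cuda:
        inputs, labels = inputs.cuda(), labels.cuda()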

    # Clear gradients w.r.t. parameters
    optimizer.zero_grad() 

    # Forward to get output
    outputs = model(inputs)

    # Calculate Loss
    loss = criterion(outputs, labels)

    # Getting gradients w.r.t. parameters
    loss.backward()

    # Updating parameters
    optimizer.step()

My question is: how does the optimizer obtain the loss gradients computed by loss.backward() so that it can update the parameters in step()? How are the model, the loss criterion, and the optimizer tied together?

Vishnu Subramanian · Accepted Answer

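PyTorch has the concepts of tensors and variables. When you use nn.Linear, it creates two variables: W (the weight) and b (the bias). In PyTorch, a variable is a wrapper that encapsulates a tensor, its gradient, and information about the function that created it. (Since PyTorch 0.4, Variable has been merged into Tensor, so a tensor with requires_grad=True plays this role.) You can access the gradients directly, where w stands for a parameter such as model.linear.weight: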

w.grad

If you try this before calling loss.backward(), you get None. Once loss.backward() has been called, w.grad contains the gradients. You can then update the parameter manually from its gradient with the simple step below.

w.data -= learning_rate * w.grad.data
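To make the sequence above concrete, here is a minimal sketch. The data, the seed, and the names `linear`, `grad_before`, and `w_old` are made up for illustration and are not part of the question. It uses `torch.no_grad()` for the update, which is the currently recommended alternative to touching `.data`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the illustration reproducible

linear = nn.Linear(1, 1)   # creates linear.weight and linear.bias
w = linear.weight

# Made-up data for illustration (y = 2x)
x = torch.tensor([[1.0], [2.0], [3.0]])
y = torch.tensor([[2.0], [4.0], [6.0]])

grad_before = w.grad       # None: backward() has not been called yet

loss = nn.MSELoss()(linear(x), y)
loss.backward()            # fills w.grad and linear.bias.grad

learning_rate = 0.01
w_old = w.detach().clone() # copy of the weight before the update

# Manual gradient-descent step; no_grad() keeps the update out of the graph
with torch.no_grad():
    for p in linear.parameters():
        p -= learning_rate * p.grad

print(grad_before)   # None
print(w.grad)        # a 1x1 tensor holding d(loss)/d(weight)
```

The loop over `linear.parameters()` is essentially what a plain SGD optimizer does for you.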

When you have a complex network, this simple step can become unwieldy, so optimisers such as SGD and Adam take care of it. When you create one of these optimisers, you pass in the parameters of your model. nn.Module provides the parameters() method, which returns all the learnable parameters so they can be handed to the optimiser, as in the step below.

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
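As a rough check of how the pieces are tied together, the sketch below (with made-up data, and a bare `nn.Linear` standing in for the question's model) inspects `optimizer.param_groups` to confirm that the optimizer holds references to the very same tensor objects as the model. It then checks that `optimizer.step()` for plain SGD, without momentum or weight decay, matches the manual update.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the illustration reproducible

model = nn.Linear(1, 1)
lr = 0.01
optimizer = torch.optim.SGD(model.parameters(), lr=lr)

# The optimizer stores references to the model's parameter tensors, not copies
opt_params = optimizer.param_groups[0]["params"]
same_objects = all(p is q for p, q in zip(model.parameters(), opt_params))

# Made-up data for illustration (y = 2x)
x = torch.tensor([[1.0], [2.0], [3.0]])
y = torch.tensor([[2.0], [4.0], [6.0]])

optimizer.zero_grad()
loss = nn.MSELoss()(model(x), y)
loss.backward()  # writes the gradients into each parameter's .grad

# What a manual gradient-descent step would produce
expected = [p.detach() - lr * p.grad for p in model.parameters()]

optimizer.step()  # reads .grad from the same tensors and updates them in place

matches_manual = all(
    torch.allclose(p.detach(), e) for p, e in zip(model.parameters(), expected)
)
print(same_objects, matches_manual)
```

So the loss never talks to the optimizer directly: backward() writes into `.grad` on the shared parameter tensors, and step() reads from there.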

catethos · Answer

loss.backward()

calculates the gradients and stores them in the parameters (in each parameter's .grad attribute). You pass in the parameters that need to be tuned here:

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
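To see what "calculates the gradients" amounts to for this model, the following sketch compares the values that backward() stores with the hand-derived gradients of the mean squared error, d(loss)/dw = mean(2 * (w*x + b - y) * x) and d(loss)/db = mean(2 * (w*x + b - y)). The data and variable names are made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the illustration reproducible

model = nn.Linear(1, 1)

# Made-up data for illustration (y = 2x + 1)
x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y = torch.tensor([[3.0], [5.0], [7.0], [9.0]])

loss = nn.MSELoss()(model(x), y)
loss.backward()  # autograd stores the results in model.weight.grad and model.bias.grad

# Hand-derived gradients of the mean squared error for a 1-D linear model
with torch.no_grad():
    residual = model(x) - y
    grad_w = (2 * residual * x).mean()
    grad_b = (2 * residual).mean()

w_matches = torch.allclose(model.weight.grad, grad_w.reshape(1, 1))
b_matches = torch.allclose(model.bias.grad, grad_b.reshape(1))
print(w_matches, b_matches)
```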

Tags:

python

gradient

neural-network

pytorch

regression

Dimitris Poulopoulos
