What is the class definition of nn.Linear in PyTorch?

I have the following code for PyTorch:

import torch.nn as nn
import torch.nn.functional as F

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(784, 256)
        self.output = nn.Linear(256, 10)

    def forward(self, x):
        x = F.sigmoid(self.hidden(x))
        x = F.softmax(self.output(x), dim=1)

        return x

My question: What is this self.hidden?

It is returned by nn.Linear, and it can take x as an argument. What exactly is the purpose of self.hidden?

asked Feb 27 '19 23:02

jason

2 Answers

What is the class definition of nn.Linear in pytorch?

From documentation:

CLASS torch.nn.Linear(in_features, out_features, bias=True)

Applies a linear transformation to the incoming data: y = x*W^T + b

Parameters:

in_features – size of each input sample (i.e. size of x)
out_features – size of each output sample (i.e. size of y)
bias – If set to False, the layer will not learn an additive bias. Default: True

Note that the weights W have shape (out_features, in_features) and biases b have shape (out_features). They are initialized randomly and can be changed later (e.g. during the training of a Neural Network they are updated by some optimization algorithm).
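As a quick check of those shapes, here is a minimal sketch. The names layer and no_bias are illustrative and do not come from the question:

```python
import torch.nn as nn

layer = nn.Linear(784, 256)  # in_features=784, out_features=256
weight_shape = tuple(layer.weight.shape)
bias_shape = tuple(layer.bias.shape)
print(weight_shape)  # (256, 784), i.e. (out_features, in_features)
print(bias_shape)    # (256,), i.e. (out_features,)

# With bias=False the layer has no bias parameter at all
no_bias = nn.Linear(784, 256, bias=False)
print(no_bias.bias)  # None
```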

In your neural network, self.hidden = nn.Linear(784, 256) defines a hidden, fully connected linear layer ("hidden" meaning that it sits between the input and output layers). It takes an input x of shape (batch_size, 784), where batch_size is the number of inputs (each of size 784) passed to the network at once as a single tensor, and transforms it by the linear equation y = x*W^T + b into a tensor y of shape (batch_size, 256). That result is then passed through the sigmoid function in x = F.sigmoid(self.hidden(x)), which is not part of nn.Linear but an additional step.
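To see those shapes in practice, the following sketch pushes a random batch through two layers of the same sizes as in the question. The batch size of 64 and the standalone variables hidden, output, h and probs are assumptions made for illustration; torch.sigmoid and torch.softmax are used here and compute the same functions as F.sigmoid and F.softmax:

```python
import torch
import torch.nn as nn

batch_size = 64  # arbitrary choice; any positive batch size works
x = torch.randn(batch_size, 784)

hidden = nn.Linear(784, 256)
output = nn.Linear(256, 10)

# Calling a layer like a function applies x @ W^T + b
h = torch.sigmoid(hidden(x))
probs = torch.softmax(output(h), dim=1)

print(h.shape)      # torch.Size([64, 256])
print(probs.shape)  # torch.Size([64, 10])
print(probs.sum(dim=1))  # every row sums to approximately 1
```

So self.hidden is an nn.Module instance that stores W and b and can be called on a tensor, which is why it "can take x as argument".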

Let's see a concrete example:

import torch
import torch.nn as nn

x = torch.tensor([[1.0, -1.0],
                  [0.0,  1.0],
                  [0.0,  0.0]])

in_features = x.shape[1]  # = 2
out_features = 2

m = nn.Linear(in_features, out_features)

where x contains three inputs (i.e. the batch size is 3), x[0], x[1] and x[2], each of size 2, and the output is going to be of shape (batch_size, out_features) = (3, 2).

The values of the parameters (weights and biases) are:

>>> m.weight
tensor([[-0.4500,  0.5856],
        [-0.1807, -0.4963]])

>>> m.bias
tensor([ 0.2223, -0.6114])

(because they were initialized randomly, most likely you will get different values from the above)

The output is:

>>> y = m(x)
>>> y
tensor([[-0.8133, -0.2959],
        [ 0.8079, -1.1077],
        [ 0.2223, -0.6114]])

and (behind the scenes) it is computed as:

y = x.matmul(m.weight.t()) + m.bias  # y = x*W^T + b

i.e.

y[i,j] == x[i,0] * m.weight[j,0] + x[i,1] * m.weight[j,1] + m.bias[j]

where i is in interval [0, batch_size) and j in [0, out_features).
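The element-wise formula can be checked by hand without PyTorch. The sketch below uses plain Python lists and copies the rounded values printed above for x, m.weight and m.bias, so the result only agrees with the printed output to about three decimal places:

```python
# Values copied from the printed x, m.weight and m.bias (rounded to 4 decimals)
x = [[1.0, -1.0], [0.0, 1.0], [0.0, 0.0]]
weight = [[-0.4500, 0.5856], [-0.1807, -0.4963]]  # shape (out_features, in_features)
bias = [0.2223, -0.6114]

batch_size, in_features, out_features = len(x), len(x[0]), len(weight)

# y[i][j] = sum over k of x[i][k] * weight[j][k], plus bias[j]
y = [
    [
        sum(x[i][k] * weight[j][k] for k in range(in_features)) + bias[j]
        for j in range(out_features)
    ]
    for i in range(batch_size)
]

for row in y:
    print([round(v, 4) for v in row])
```

Note that the last input is all zeros, so its output row is just the bias.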

answered Oct 06 '22 19:10

Andreas K.

The Network is defined as having two layers, hidden and output. Roughly speaking, the function of the hidden layer is to hold parameters you can optimize during training.
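One way to look at the parameters each layer holds is named_parameters(). This is a minimal sketch that reuses the class from the question but omits forward, since only the parameters are inspected; the names net, param_shapes and total are illustrative:

```python
import torch.nn as nn

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(784, 256)
        self.output = nn.Linear(256, 10)

net = Network()

# Each nn.Linear registers a weight and a bias under its attribute name
param_shapes = {name: tuple(p.shape) for name, p in net.named_parameters()}
for name, shape in param_shapes.items():
    print(name, shape)

# Total number of trainable values: 784*256 + 256 + 256*10 + 10
total = sum(p.numel() for p in net.parameters())
print(total)  # 203530
```

These are the tensors an optimizer would receive via net.parameters() and update during training.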

answered Oct 06 '22 19:10

Sergii Dymchenko
