Please excuse the novice question, but is Module just the same as saying model?
That's what it sounds like, when the documentation says:
Whenever you want a model more complex than a simple sequence of existing Modules you will need to define your model (as a custom Module subclass).
Or... when they mention Module, are they referring to something more formal and computer-sciency, like a protocol / interface type thing?
Model. In PyTorch, a model is represented by a regular Python class that inherits from the Module class. The most fundamental methods it needs to implement include __init__(self), which defines the parts that make up the model (in our case, two parameters, a and b).
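To make that concrete, here is a minimal sketch of such a class. The class name LinearRegression and the formula a + b * x are my own assumptions for illustration; the snippet above only says the model has two parameters, a and b.

```python
import torch
from torch import nn


class LinearRegression(nn.Module):  # hypothetical name, not from the quoted text
    def __init__(self):
        super().__init__()
        # Wrapping a tensor in nn.Parameter registers it as a trainable
        # parameter of this module.
        self.a = nn.Parameter(torch.randn(1))
        self.b = nn.Parameter(torch.randn(1))

    def forward(self, x):
        # Assumed model for illustration: a straight line a + b * x.
        return self.a + self.b * x


model = LinearRegression()
param_names = [name for name, _ in model.named_parameters()]
y = model(torch.tensor([1.0, 2.0, 3.0]))
print(param_names, y.shape)
```

So the "model" here is just an instance of a Module subclass; nothing else marks it as a model.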
model.children() is a generator that yields the immediate child layers of the model, from which you can extract parameter tensors using <layername>.weight or <layername>.bias (for layers that have them).
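A small sketch of iterating over children(), assuming a toy nn.Sequential model that I made up for the example. Note that not every layer has weight and bias attributes (ReLU does not), so the loop checks first.

```python
import torch
from torch import nn

# Toy model, assumed for illustration only.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 2),
)

layer_types = []
for layer in model.children():
    layer_types.append(type(layer).__name__)
    # Only parameterised layers expose .weight / .bias.
    if hasattr(layer, "weight"):
        print(type(layer).__name__, tuple(layer.weight.shape), tuple(layer.bias.shape))
```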
model.train() puts your model in training mode. This matters for layers such as Dropout and BatchNorm, which are designed to behave differently during training and evaluation.
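A quick sketch of the mode switch, using a made-up two-layer model. In evaluation mode Dropout passes its input through unchanged, so two forward passes on the same input agree.

```python
import torch
from torch import nn

# Toy model, assumed for illustration only.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
x = torch.ones(2, 4)

model.train()                    # training mode: Dropout zeroes random elements
training_flag = model.training

model.eval()                     # evaluation mode: Dropout acts as the identity
eval_flag = model.training
with torch.no_grad():
    out1 = model(x)
    out2 = model(x)
outputs_match = torch.equal(out1, out2)
```

Both train() and eval() propagate to every submodule, which is one of the things you get from inheriting from nn.Module.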
PyTorch provides the torch.nn module to help us create and train neural networks. We will first train the basic neural network on the MNIST dataset without using any features from these models.
It's a simple container.
From the docs of nn.Module:
Base class for all neural network modules. Your models should also subclass this class. Modules can also contain other Modules, allowing to nest them in a tree structure. You can assign the submodules as regular attributes. Submodules assigned in this way will be registered, and will have their parameters converted too when you call .cuda(), etc.
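As a rough illustration of that nesting, here is a sketch with hypothetical Block and Net classes (names and sizes are mine, not from the docs). Assigning a Module as an attribute is enough to register it, and moving the outer module moves all nested parameters with it.

```python
import torch
from torch import nn


class Block(nn.Module):  # hypothetical sub-part
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.linear(x))


class Net(nn.Module):  # hypothetical model built from Blocks
    def __init__(self):
        super().__init__()
        # Plain attribute assignment registers each submodule.
        self.block1 = Block(4)
        self.block2 = Block(4)
        self.head = nn.Linear(4, 1)

    def forward(self, x):
        return self.head(self.block2(self.block1(x)))


net = Net()
names = [name for name, _ in net.named_parameters()]

# Falls back to CPU when no GPU is available.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net.to(device)  # converts the parameters of every nested submodule
```

Here Block and Net are both just Modules; the only difference is where they sit in the tree.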
From the tutorial:
All network components should inherit from nn.Module and override the forward() method. That is about it, as far as the boilerplate is concerned. Inheriting from nn.Module provides functionality to your component. For example, it makes it keep track of its trainable parameters, you can swap it between CPU and GPU with the .to(device) method, where device can be a CPU device torch.device("cpu") or CUDA device torch.device("cuda:0").
A module is a container from which layers, model subparts (e.g. BasicBlock in resnet in torchvision) and models should inherit. Why should they? Because the inheritance from nn.Module allows you to call methods like to("cuda:0"), .eval(), .parameters() or register hooks easily.
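For instance, a sketch of .parameters() and a forward hook on a toy model (the model and the record_shape function are assumptions for illustration). register_forward_hook calls the hook with the module, its inputs and its output after each forward pass, and returns a handle you can use to remove it.

```python
import torch
from torch import nn

# Toy model, assumed for illustration only.
model = nn.Sequential(nn.Linear(3, 5), nn.ReLU(), nn.Linear(5, 2))
shapes = []


def record_shape(module, inputs, output):
    # Called after each hooked module's forward pass.
    shapes.append((type(module).__name__, tuple(output.shape)))


handles = [m.register_forward_hook(record_shape) for m in model.children()]

model.eval()
with torch.no_grad():
    model(torch.zeros(4, 3))

# Remove the hooks once they are no longer needed.
for h in handles:
    h.remove()

# .parameters() walks every registered parameter in the module tree.
n_params = sum(p.numel() for p in model.parameters())
```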
That's an API design choice and I find having only a Module class instead of two separate Model and Layers to be cleaner and to allow more freedom (it's easier to send just a part of the model to GPU, to get parameters only for some layers...).
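A sketch of that freedom, assuming a hypothetical Net with a backbone and a head. Because each part is itself a Module, you can freeze it, hand only its parameters to an optimizer, or move it to a device on its own.

```python
import torch
from torch import nn


class Net(nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.head = nn.Linear(16, 2)

    def forward(self, x):
        return self.head(self.backbone(x))


net = Net()

# Parameters of just one sub-part of the model.
head_params = list(net.head.parameters())

# Freeze the backbone and optimise only the head.
for p in net.backbone.parameters():
    p.requires_grad_(False)
optimizer = torch.optim.SGD(net.head.parameters(), lr=0.1)

# Send only the head to the GPU if one is available. Activations would then
# have to be moved between devices inside forward(), which is omitted here.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net.head.to(device)

n_trainable = sum(p.numel() for p in net.parameters() if p.requires_grad)
```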