How to count the number of layers in a CNN?

The PyTorch implementation of ResNet-18 has the following structure, which appears to have 54 layers, not 18.

So why is it called "18"? How many layers does it actually have?


ResNet (
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
  (relu): ReLU (inplace)
  (maxpool): MaxPool2d (size=(3, 3), stride=(2, 2), padding=(1, 1), dilation=(1, 1))
  (layer1): Sequential (
    (0): BasicBlock (
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
    )
    (1): BasicBlock (
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
    )
  )
  (layer2): Sequential (
    (0): BasicBlock (
      (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
      (downsample): Sequential (
        (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
      )
    )
    (1): BasicBlock (
      (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
    )
  )
  (layer3): Sequential (
    (0): BasicBlock (
      (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
      (downsample): Sequential (
        (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
      )
    )
    (1): BasicBlock (
      (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
    )
  )
  (layer4): Sequential (
    (0): BasicBlock (
      (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
      (downsample): Sequential (
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
      )
    )
    (1): BasicBlock (
      (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
    )
  )
  (avgpool): AvgPool2d (
  )
  (fc): Linear (512 -> 1000)
)

482

asked Apr 03 '17 02:04

mcgG

2 Answers

From your output, we can see that there are 20 convolution layers: one 7x7 conv, sixteen 3x3 convs, and three 1x1 convs used for downsampling. If you ignore the 1x1 convs and count the FC (linear) layer, the number of layers is 18.

I've also made an example of how to visualize your architecture in PyTorch via graphviz; I hope it helps you understand your architecture.

174

answered Oct 27 '22 06:10

Meta Fan

Why does ResNet-18 have 18 layers?

The answer is pretty straightforward: the number of layers in a neural net is a hyperparameter (meaning you can tune it as you want). In the ResNet paper, the authors trained multiple models of various depths (18, 34, 50, and so on) to study accuracy, error rate, etc. The naming convention they followed is therefore ResNet-18, ResNet-34, ResNet-50...

Why does the architecture of ResNet-18 (that you've provided in your question) have more than 18 layers?

There are a number of ways people count the layers of a deep neural net: some count the input/output layers as well, and some count the pooling layers.

But the authors of the ResNet paper counted only the convolution layers and the fully connected layer, nothing else. The model architecture you've given still has more than 18 such layers, and that is simply because of the 1x1 convolution layers. The authors call them projections (projection shortcuts); they are used only to match the dimensions of the input (x) to those of the residual block's output (F(x)) so that the two can be summed (y = F(x) + x). So if you count without those projections (1x1 convs), you'll see there are 18 layers, hence the name ResNet-18.
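The counting convention described above reduces to simple arithmetic: one stem convolution, a fixed number of convolutions per residual block, and one fully connected layer. Here is a small sketch of that arithmetic. The function name `resnet_depth` is hypothetical, and the block counts for ResNet-34 and ResNet-50 are taken from the ResNet paper rather than from this thread.

```python
def resnet_depth(blocks_per_stage, convs_per_block):
    """Depth as counted in the ResNet naming: convs + FC, no shortcut convs."""
    stem_conv = 1        # the initial 7x7 convolution
    fully_connected = 1  # the final linear classifier
    block_convs = convs_per_block * sum(blocks_per_stage)
    return stem_conv + block_convs + fully_connected


# ResNet-18: four stages of two BasicBlocks, two 3x3 convs per block.
print(resnet_depth([2, 2, 2, 2], convs_per_block=2))  # 18
# ResNet-34: same BasicBlock, more blocks per stage.
print(resnet_depth([3, 4, 6, 3], convs_per_block=2))  # 34
# ResNet-50: bottleneck blocks with three convs each.
print(resnet_depth([3, 4, 6, 3], convs_per_block=3))  # 50
```

For ResNet-18 this matches the printout in the question: 1 + 2 * 8 + 1 = 18, with the three 1x1 downsample convs left out.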

answered Oct 27 '22 07:10

Khalid Saifullah

Tags: neural-network, deep-learning, pytorch, resnet, deep-residual-networks
