 

How to count the number of layers in a CNN?

The PyTorch implementation of ResNet-18 has the following structure, which appears to have 54 layers, not 18.

So why is it called "18"? How many layers does it actually have?
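
For reference, a printout like the one below can be reproduced with something like this (assuming torchvision is installed):

import torchvision.models as models

model = models.resnet18()
print(model)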


ResNet (
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
  (relu): ReLU (inplace)
  (maxpool): MaxPool2d (size=(3, 3), stride=(2, 2), padding=(1, 1), dilation=(1, 1))
  (layer1): Sequential (
    (0): BasicBlock (
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
    )
    (1): BasicBlock (
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
    )
  )
  (layer2): Sequential (
    (0): BasicBlock (
      (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
      (downsample): Sequential (
        (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
      )
    )
    (1): BasicBlock (
      (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
    )
  )
  (layer3): Sequential (
    (0): BasicBlock (
      (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
      (downsample): Sequential (
        (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
      )
    )
    (1): BasicBlock (
      (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
    )
  )
  (layer4): Sequential (
    (0): BasicBlock (
      (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
      (downsample): Sequential (
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
      )
    )
    (1): BasicBlock (
      (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU (inplace)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
    )
  )
  (avgpool): AvgPool2d (
  )
  (fc): Linear (512 -> 1000)
)
asked Apr 03 '17 by mcgG

People also ask

How can I see the number of layers in a CNN?

In the simple case (stride 1, no padding), the output size of a CNN layer is calculated as input_size - (filter_size - 1). For example, if the input image size is (50, 50) and the filter is (3, 3), then 50 - (3 - 1) = 48.
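
As a sketch, the general formula (which reduces to the simple case above when stride is 1 and padding is 0) can be computed like this:

def conv_output_size(input_size, filter_size, stride=1, padding=0):
    # General case; equals input_size - (filter_size - 1) when stride=1, padding=0
    return (input_size - filter_size + 2 * padding) // stride + 1

print(conv_output_size(50, 3))  # 48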

How many layers are there in a CNN architecture?

What is the architecture of a CNN? It has three kinds of layers: convolutional, pooling, and fully connected. A CNN is a class of neural network that processes data with a grid-like topology. The convolutional layer is the building block of a CNN and carries the main computational load.
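
A minimal PyTorch sketch of those three layer types (a hypothetical toy network, assuming 32x32 RGB input):

import torch.nn as nn

toy_cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolutional layer
    nn.ReLU(),
    nn.MaxPool2d(2),                             # pooling layer: 32x32 -> 16x16
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 10),                 # fully connected layer
)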

How many convolutional layers does a CNN have?

That depends on the architecture. As one example, a small CNN might have 4 convolutional layers, 3 max-pooling layers, two fully connected layers, and one softmax output layer.

How do you calculate the number of neurons in CNN?

One simple way to count the neurons in a layer is to multiply its three dimensions (channels x width x height). In AlexNet, which runs two parallel GPU streams (hence the x 2): Layer 2: 27 x 27 x 128 x 2 = 186,624. Layer 3: 13 x 13 x 192 x 2 = 64,896.
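
The arithmetic can be checked directly:

# neurons = channels * width * height (* 2 for AlexNet's two GPU streams)
print(27 * 27 * 128 * 2)  # 186624
print(13 * 13 * 192 * 2)  # 64896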


2 Answers

From your output, you can see that there are 20 convolutional layers (one 7x7 conv, sixteen 3x3 convs, plus three 1x1 convs for downsampling). If you ignore the 1x1 downsampling convs and count the final FC (linear) layer, the number of layers is 18.
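
You can verify this count programmatically; a small sketch using torchvision's resnet18:

import torch.nn as nn
import torchvision.models as models

model = models.resnet18()
convs = [m for m in model.modules() if isinstance(m, nn.Conv2d)]
fcs = [m for m in model.modules() if isinstance(m, nn.Linear)]
print(len(convs))  # 20 (includes the three 1x1 downsample convs)
print(len(fcs))    # 1
# 20 convs - 3 projections + 1 fc = 18, hence "ResNet-18"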

I've also made an example of how to visualize an architecture in PyTorch via graphviz; I hope it will help you understand the structure of your model.
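
One way to do that is sketched below, assuming the third-party torchviz package (pip install torchviz), which renders the autograd graph via graphviz:

import torch
import torchvision.models as models
from torchviz import make_dot

model = models.resnet18()
y = model(torch.randn(1, 3, 224, 224))
make_dot(y, params=dict(model.named_parameters())).render("resnet18", format="png")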

answered Oct 27 '22 by Meta Fan


Why does ResNet-18 have 18 layers?

The answer is pretty straightforward: the number of layers in a neural net is a hyperparameter (meaning you can tune it as you want). In the ResNet paper, the authors trained models of various depths (such as 18, 34, and 50 layers) to conduct a proper study of accuracy, error rate, etc., hence the naming convention they followed: ResNet-18, ResNet-34, ResNet-50...

Why does the ResNet-18 architecture (the one you've provided in your question) have more than 18 layers?

There are a number of ways people count the layers of a deep neural net model: some count the input/output layers as well, and some count the pooling layers.

But the way the authors counted in the ResNet paper is that they counted only the convolutional layers and the fully connected layer, nothing else. The architecture you've given, however, has more than 18 layers! That is simply because of the 1x1 convolution layers, which the authors call projection shortcuts. These layers are used only to match the input's dimensions (x) with the residual block's output dimensions (F(x)) so that the two can be summed (y = F(x) + x). If you count without those projections (the 1x1 convs), you get 18 layers, hence the name ResNet-18.
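
A minimal sketch of that residual sum with the 1x1 projection shortcut (a hypothetical simplification of torchvision's BasicBlock):

import torch.nn as nn

class BasicBlockSketch(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.downsample = None
        if stride != 1 or in_ch != out_ch:
            # 1x1 projection: reshapes x to match F(x) so the two can be summed
            self.downsample = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        identity = x if self.downsample is None else self.downsample(x)
        out = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(out + identity)  # y = F(x) + x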

answered Oct 27 '22 by Khalid Saifullah