
How should "BatchNorm" layer be used in caffe?

I am a little confused about how I should use/insert a "BatchNorm" layer in my models.
I see several different approaches, for instance:

ResNets: "BatchNorm"+"Scale" (no parameter sharing)

"BatchNorm" layer is followed immediately with "Scale" layer:

layer {
    bottom: "res2a_branch1"
    top: "res2a_branch1"
    name: "bn2a_branch1"
    type: "BatchNorm"
    batch_norm_param {
        use_global_stats: true
    }
}

layer {
    bottom: "res2a_branch1"
    top: "res2a_branch1"
    name: "scale2a_branch1"
    type: "Scale"
    scale_param {
        bias_term: true
    }
}

cifar10 example: only "BatchNorm"

In the cifar10 example provided with caffe, "BatchNorm" is used without any "Scale" following it:

layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
}

cifar10 example: different batch_norm_param for TRAIN and TEST

batch_norm_param: use_global_stats is changed between the TRAIN and TEST phases:

layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    use_global_stats: false
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  include {
    phase: TRAIN
  }
}
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    use_global_stats: true
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  include {
    phase: TEST
  }
}

So what should it be?

How should one use the "BatchNorm" layer in caffe?

asked Jan 12 '17 by Shai


2 Answers

If you follow the original paper, batch normalization should be followed by scale and bias layers (the bias can be included via the "Scale" layer, although this makes the bias parameters inaccessible as a separate blob). use_global_stats should also be changed from training (false) to testing/deployment (true), which is the default behavior. Note that the first example you give is a prototxt for deployment, so it is correct for it to be set to true.
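For example, a minimal sketch along these lines (the blob name "conv1" and the layer names are placeholders, not taken from either of your examples; leaving batch_norm_param unset lets Caffe pick mini-batch statistics in TRAIN and the accumulated global statistics in TEST) could be:

layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  # no use_global_stats here: Caffe defaults to mini-batch
  # statistics in TRAIN and to the global statistics in TEST
}
layer {
  name: "scale1"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  scale_param {
    bias_term: true  # learn both gamma (scale) and beta (shift)
  }
}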

I'm not sure about the shared parameters.

I made a pull request to improve the documentation on batch normalization, but then closed it because I wanted to modify it, and I never got back to it.

Note that I think lr_mult: 0 for "BatchNorm" is no longer required (perhaps not allowed?), although I'm not finding the corresponding PR now.

answered Sep 18 '22 by Jonathan


After each "BatchNorm", we have to add a "Scale" layer in Caffe. The reason is that the Caffe "BatchNorm" layer only subtracts the mean from the input data and divides by the standard deviation; it does not include the γ and β parameters that respectively scale and shift the normalized distribution. In contrast, the Keras BatchNormalization layer includes and applies all of these parameters. Using a "Scale" layer with bias_term set to true in Caffe is a safe way to reproduce the exact behavior of the Keras version. https://www.deepvisionconsulting.com/from-keras-to-caffe/
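Concretely, using the γ/β notation above, the work is split between the two Caffe layers roughly as:

  BatchNorm:                x̂ = (x − μ) / √(σ² + ε)
  Scale (bias_term: true):  y = γ·x̂ + β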

answered Sep 18 '22 by Ehsan Akbari Tabar