
BatchNorm momentum convention PyTorch

Is the batchnorm momentum convention (default=0.1) correct? In other libraries, e.g. TensorFlow, it usually seems to be 0.9 or 0.99 by default. Or maybe we are just using a different convention?

peter554 asked Jan 19 '18 16:01

People also ask

What is Batchnorm in PyTorch?

Batch normalization in PyTorch normalizes a layer's activations over each mini-batch. While the network is training, the layer also keeps a running estimate of the mean and variance it computes on each batch; at evaluation time these stored estimates are used instead of the per-batch statistics.
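
As a rough sketch (the layer size and data are made up for illustration), the running estimates are updated only while the layer is in training mode:

import torch
import torch.nn as nn

bn = nn.BatchNorm1d(4)            # running_mean starts at 0, running_var at 1
x = torch.randn(32, 4) * 2 + 5    # batch with non-zero mean and non-unit variance

bn.train()
_ = bn(x)                         # training-mode forward pass updates the running stats
print(bn.running_mean)            # has moved toward the batch mean

bn.eval()
_ = bn(x)                         # eval mode normalizes with the stored estimates
print(bn.running_mean)            # unchanged by the eval() pass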

What is momentum in batch normalization?

Momentum controls the "lag" in learning the running mean and variance, so that noise from individual mini-batches is smoothed out. In the convention where the momentum (or decay) weights the old running value, as in TensorFlow, it is typically set to a high value such as 0.99, meaning a large lag and slow updates: the running statistics follow the per-batch values much more slowly at momentum 0.99 than at, say, 0.75.
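
A minimal numeric sketch of that lag, using the convention above where the weight is applied to the old running value (the numbers are invented):

def update(running, batch_value, momentum):
    # exponential moving average: momentum weights the old running value
    return momentum * running + (1 - momentum) * batch_value

slow, fast = 0.0, 0.0
for batch_mean in [5.1, 4.8, 5.3, 4.9]:        # noisy per-batch means around 5
    slow = update(slow, batch_mean, 0.99)      # high momentum: large lag, slow learning
    fast = update(fast, batch_mean, 0.75)      # lower momentum: follows the batches faster
print(slow, fast)                              # slow has barely moved from 0; fast is much closer to 5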

What is BatchNorm2d?

The argument passed to BatchNorm2d is the number of channels (feature maps) output by the previous layer, i.e. the number of channels coming into the batch norm layer.
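
For example (the layer sizes here are arbitrary), the value passed to BatchNorm2d has to match the channel count produced by the preceding layer:

import torch
import torch.nn as nn

block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # produces 16 output channels
    nn.BatchNorm2d(16),                          # so num_features must be 16
)
out = block(torch.randn(8, 3, 32, 32))           # (batch, channels, height, width)
print(out.shape)                                 # torch.Size([8, 16, 32, 32])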


1 Answer

The parametrization convention is different in PyTorch than in TensorFlow: a momentum of 0.1 in PyTorch is equivalent to a decay of 0.9 in TensorFlow.

To be more precise:

In TensorFlow:

running_mean = decay*running_mean + (1-decay)*new_value

In PyTorch:

running_mean = (1-decay)*running_mean + decay*new_value

This means that a value of decay in PyTorch is equivalent to a value of (1-decay) in TensorFlow, so PyTorch's default momentum of 0.1 corresponds to a decay of 0.9 in TensorFlow.
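
A quick sketch to check this against the PyTorch formula above (the shapes are arbitrary, and the check relies on running_mean being initialized to zero):

import torch
import torch.nn as nn

momentum = 0.1                       # PyTorch default, i.e. decay = 0.9 in TensorFlow terms
bn = nn.BatchNorm2d(16, momentum=momentum)

x = torch.randn(8, 16, 32, 32)
bn.train()
_ = bn(x)                            # one training-mode forward pass updates the running stats

batch_mean = x.mean(dim=(0, 2, 3))   # per-channel mean of the batch
expected = (1 - momentum) * torch.zeros(16) + momentum * batch_mean
print(torch.allclose(bn.running_mean, expected, atol=1e-5))  # expect True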

patapouf_ai answered Oct 01 '22 07:10