
What is the right batch normalization function in TensorFlow?

In TensorFlow 1.4, I found two functions that do batch normalization, and they look the same:

  1. tf.layers.batch_normalization (link)
  2. tf.contrib.layers.batch_norm (link)

Which function should I use? Which one is more stable?

asked Dec 28 '17 by KimHee

People also ask

What is batch normalization in Tensorflow?

Batch normalization applies a transformation that maintains the mean output close to 0 and the output standard deviation close to 1. Importantly, batch normalization works differently during training and during inference.
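As a rough illustration of that train/inference difference, here is a minimal sketch using tf.keras.layers.BatchNormalization (assuming eager execution; the toy input is made up and not from the original post):

    import numpy as np
    import tensorflow as tf

    x = np.random.randn(4, 3).astype("float32")  # toy batch: 4 samples, 3 features
    bn = tf.keras.layers.BatchNormalization()

    # training=True: normalize with this batch's mean/variance and
    # update the layer's moving statistics.
    y_train = bn(x, training=True)

    # training=False: normalize with the accumulated moving mean/variance.
    y_infer = bn(x, training=False)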

What is the purpose of batch normalization?

Batch normalization addresses a major problem called internal covariate shift. It keeps the data flowing between the intermediate layers of the neural network normalized, which means you can use a higher learning rate. It also has a regularizing effect, which means you can often remove dropout.

What does Tensorflow normalize do?

TensorFlow's normalize is a method available in the TensorFlow library that performs normalization on tensors in neural networks. Its main purpose is to transform the data so that all features are on the same or a similar scale.
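As a rough sketch (assuming the question refers to per-row L2 normalization such as tf.keras.utils.normalize; the array values are made up):

    import numpy as np
    import tensorflow as tf

    # Two samples whose features live on very different scales.
    x = np.array([[1.0, 200.0, 0.5],
                  [2.0, 100.0, 1.5]])

    # Scale each row to unit L2 norm so the features become comparable.
    x_norm = tf.keras.utils.normalize(x, axis=-1, order=2)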

What is batch normalization formula?

The basic formula is x* = (x - E[x]) / sqrt(var(x)), where x* is the new value of a single component, E[x] is its mean within a batch, and var(x) is its variance within a batch. Batch normalization extends that formula to x** = gamma * x* + beta, where x** is the final normalized value; gamma and beta are learned per layer.
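A small NumPy sketch of that arithmetic (the gamma/beta values are made up; real implementations also add a small epsilon inside the square root for numerical stability):

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0])    # one feature across a batch of 4

    mean = x.mean()                        # E[x] = 2.5
    var = x.var()                          # var(x) = 1.25
    x_star = (x - mean) / np.sqrt(var)     # x*  = (x - E[x]) / sqrt(var(x))

    gamma, beta = 2.0, 0.5                 # learned per layer; arbitrary here
    x_star_star = gamma * x_star + beta    # x** = gamma * x* + beta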


2 Answers

Just to add to the list, there are several more ways to do batch norm in TensorFlow:

  • tf.nn.batch_normalization is a low-level op. The caller is responsible for handling the mean and variance tensors themselves.
  • tf.nn.fused_batch_norm is another low-level op, similar to the previous one. The difference is that it's optimized for 4D input tensors, which is the usual case in convolutional neural networks. tf.nn.batch_normalization accepts tensors of any rank greater than 1.
  • tf.layers.batch_normalization is a high-level wrapper over the previous ops. The biggest difference is that it takes care of creating and managing the running mean and variance tensors, and calls a fast fused op when possible. Usually, this should be the default choice for you (see the usage sketch after this list).
  • tf.contrib.layers.batch_norm is the early implementation of batch norm, from before it graduated to the core API (i.e., tf.layers). Its use is not recommended because it may be dropped in future releases.
  • tf.nn.batch_norm_with_global_normalization is another deprecated op. Currently it delegates the call to tf.nn.batch_normalization, but it is likely to be dropped in the future.
  • Finally, there's also the Keras layer keras.layers.BatchNormalization, which, in the case of the TensorFlow backend, invokes tf.nn.batch_normalization.
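To make the recommended path concrete, here is a minimal TF 1.x-style sketch of tf.layers.batch_normalization; the convolution, loss, and optimizer are only placeholders to show where the pieces go, in particular the UPDATE_OPS dependency that keeps the moving statistics current:

    import tensorflow as tf

    x = tf.placeholder(tf.float32, [None, 28, 28, 3])
    is_training = tf.placeholder(tf.bool, [])

    h = tf.layers.conv2d(x, filters=32, kernel_size=3, padding="same")
    # The layer creates and tracks the moving mean/variance itself and
    # uses the fused op when possible.
    h = tf.layers.batch_normalization(h, training=is_training)
    h = tf.nn.relu(h)

    loss = tf.reduce_mean(h)  # dummy loss, just to complete the graph

    # The moving-average updates live in the UPDATE_OPS collection and
    # must run together with the training step.
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)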
answered Sep 28 '22 by Maxim


As shown in the docs, tf.contrib is a contribution module containing volatile or experimental code. When a function is complete, it is moved out of this module into the core API. Both functions exist now only to stay compatible with historical versions.

So the former, tf.layers.batch_normalization, is the recommended one.

answered Sep 28 '22 by dxf