
Do you need to standardize inputs if you are using Batch Normalization?

I've been playing around with batch normalization in Keras. I was wondering if batch normalization also normalizes the inputs to the neural network. Does that mean I do not need to standardize my inputs to my network and rely on BN to do it?

asked Oct 08 '16 by simeon

People also ask

Should you standardize and normalize?

Normalization is useful when your data has varying scales and the algorithm you are using does not make assumptions about the distribution of your data, such as k-nearest neighbors and artificial neural networks. Standardization assumes that your data has a Gaussian (bell curve) distribution.
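To make the distinction concrete, here is a small numpy sketch of the two transforms (the array values are made up purely for illustration):

```python
import numpy as np

x = np.array([1.0, 5.0, 10.0, 14.0])

# Min-max normalization: rescale values into the range [0, 1]
normalized = (x - x.min()) / (x.max() - x.min())

# Z-score standardization: shift/scale to zero mean and unit variance
standardized = (x - x.mean()) / x.std()
```

Normalization keeps the relative spacing of values but bounds the range; standardization centers the data and expresses each value in units of standard deviations.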

What is not the reason for using batch normalization?

As the parameters of earlier layers change during training, the distribution of the hidden activations will also change. This change in hidden activations is known as internal covariate shift. However, according to a study by MIT researchers, batch normalization does not actually solve the problem of internal covariate shift.

What is disadvantages of batch normalization?

Disadvantages of Batch Normalization:

• It is difficult to estimate the mean and standard deviation of the input during testing.
• A batch size of 1 cannot be used during training.
• It adds computational overhead during training.

Is batch normalization used before the input layer?

Batch Norm is just another network layer that gets inserted between a hidden layer and the next hidden layer. Its job is to take the outputs from the first hidden layer and normalize them before passing them on as the input of the next hidden layer.
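As a rough illustration of what such a layer computes, here is a minimal numpy sketch of the batch-norm forward pass (the activations are randomly generated, and `gamma`/`beta` are set to their typical initial values of 1 and 0; a real framework layer also tracks running statistics for inference):

```python
import numpy as np

def batch_norm_forward(h, gamma, beta, eps=1e-5):
    """Normalize activations h of shape (batch, features) over the batch
    axis, then scale by gamma and shift by beta (the learned parameters)."""
    mu = h.mean(axis=0)
    var = h.var(axis=0)
    h_hat = (h - mu) / np.sqrt(var + eps)
    return gamma * h_hat + beta

rng = np.random.default_rng(0)
h = rng.normal(loc=3.0, scale=2.0, size=(32, 4))  # fake hidden activations
out = batch_norm_forward(h, gamma=np.ones(4), beta=np.zeros(4))
```

Note that `mu` and `var` are computed per batch, which is why the statistics fluctuate with the batch size.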


1 Answer

While you can certainly use it for that, batch normalization is not designed to do that and you will most likely introduce sampling error in your normalization due to the limited sample size (sample size is your batch size).

Another reason I would not recommend using batch normalization to normalize your inputs is that it introduces the correction terms gamma and beta (trainable parameters), which will shift the normalized values away from zero mean and unit variance unless they are disabled.

For normalizing your inputs I would instead recommend z-score normalization with statistics computed on the complete training set (e.g., via sklearn's StandardScaler), applied to both training and test data, or some appropriate alternative, but not batch normalization.
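A minimal sketch of that approach with sklearn's StandardScaler (the data here is synthetic; in practice you would use your own training and test arrays):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X_train = rng.normal(loc=10.0, scale=3.0, size=(200, 2))
X_test = rng.normal(loc=10.0, scale=3.0, size=(50, 2))

scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)  # fit on the full training set
X_test_std = scaler.transform(X_test)        # reuse the training statistics
```

The key point is that the scaler is fit once on the entire training set, so the test data is transformed with stable statistics rather than per-batch estimates.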

answered Sep 29 '22 by nemo