I have the following code:

import keras

x = keras.layers.Input(batch_shape=(None, 4096))
hidden = keras.layers.Dense(512, activation='relu')(x)
hidden = keras.layers.BatchNormalization()(hidden)
hidden = keras.layers.Dropout(0.5)(hidden)
predictions = keras.layers.Dense(80, activation='sigmoid')(hidden)
mlp_model = keras.models.Model(inputs=[x], outputs=[predictions])
mlp_model.summary()
And this is the model summary:
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
input_3 (InputLayer)             (None, 4096)          0
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 512)           2097664     input_3[0][0]
____________________________________________________________________________________________________
batchnormalization_1 (BatchNorma (None, 512)           2048        dense_1[0][0]
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 512)           0           batchnormalization_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 80)            41040       dropout_1[0][0]
====================================================================================================
Total params: 2,140,752
Trainable params: 2,139,728
Non-trainable params: 1,024
____________________________________________________________________________________________________
The input to the BatchNormalization (BN) layer has size 512. According to the Keras documentation, the output shape of a BN layer is the same as its input shape, i.e. 512.
Then how is the number of parameters associated with the BN layer 2048?
Just like the parameters (weights, biases) of any network layer, a Batch Norm layer also has parameters of its own: two learnable parameters called beta and gamma.
These parameters re-scale (γ) and shift (β) the vector of values coming from the previous operation. Both are learnable: during training, the neural network finds the optimal values of γ and β, which enables accurate normalization of each batch.
Batch Norm is a normalization technique applied between the layers of a neural network rather than to the raw data, and it is computed over mini-batches instead of the full data set. It speeds up training and allows higher learning rates, making learning easier: each mini-batch is normalized using the mean and the standard deviation of the neurons' outputs.
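As an illustration, here is a minimal NumPy sketch of that training-time transformation (the function name, epsilon value, and toy data are my own, not Keras internals):

import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-3):
    # x: (batch_size, features) activations from the previous layer
    mean = x.mean(axis=0)                    # per-feature mini-batch mean
    var = x.var(axis=0)                      # per-feature mini-batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # normalize to zero mean, unit variance
    return gamma * x_hat + beta              # re-scale (gamma) and shift (beta)

x = np.random.randn(32, 512)   # a mini-batch of 32 vectors from the 512-unit layer
gamma = np.ones(512)           # 512 learnable scale parameters
beta = np.zeros(512)           # 512 learnable shift parameters
y = batch_norm_forward(x, gamma, beta)
print(y.mean(axis=0)[:3], y.std(axis=0)[:3])  # means near 0, stds near 1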
These 2048 parameters are in fact [gamma weights, beta weights, moving_mean (non-trainable), moving_variance (non-trainable)], each having 512 elements (the size of the BN layer's input).
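You can verify this breakdown directly on the model; a minimal sketch (the layer index and the inspection loop are my own, assuming the mlp_model built in the question):

bn_layer = mlp_model.layers[2]              # the BatchNormalization layer
for w in bn_layer.weights:                  # gamma, beta, moving_mean, moving_variance
    print(w.name, w.shape)                  # each has shape (512,)
print(bn_layer.count_params())              # 2048 = 4 * 512
print(len(bn_layer.trainable_weights))      # 2 -> gamma and beta
print(len(bn_layer.non_trainable_weights))  # 2 -> moving_mean and moving_variance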
The batch normalization in Keras implements this paper: "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" (Ioffe & Szegedy, 2015).
As you can read there, in order to make batch normalization work during training, they need to keep track of the distribution of each normalized dimension. To do so, since you are in mode=0 by default, 4 parameters are computed per feature of the previous layer. Those parameters ensure that the information is properly propagated and back-propagated.
So 4 * 512 = 2048, which should answer your question.
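As a sanity check, this reproduces every number in the summary above (plain arithmetic, no assumptions beyond the layer sizes):

dense_1 = 4096 * 512 + 512    # kernel + bias             = 2,097,664
bn      = 4 * 512             # gamma, beta, moving stats =     2,048
dense_2 = 512 * 80 + 80       # kernel + bias             =    41,040

print(dense_1 + bn + dense_2)       # 2,140,752 total params
print(dense_1 + 2 * 512 + dense_2)  # 2,139,728 trainable (only gamma and beta count in BN)
print(2 * 512)                      # 1,024 non-trainable (moving_mean and moving_variance)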