
Xavier and he_normal initialization difference

What is the difference between the He normal and Xavier normal initializers in Keras? Both seem to initialize weights based on the variance of the input data. Is there an intuitive explanation for the difference between the two?

asked Feb 06 '18 by AKSHAYAA VAIDYANATHAN

People also ask

What is the Xavier initialization?

Xavier initialization is an attempt to improve the initialization of a neural network's weights in order to avoid some traditional problems in machine learning, such as vanishing and exploding gradients. The network's weights are drawn at an intermediate scale: large enough that signals are not attenuated to nothing, and small enough that they do not blow up as they pass through the layers.
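
As a rough sketch of how this looks in practice, the layer below requests Xavier (Glorot) normal initialization explicitly; the layer size, activation, and seed are arbitrary choices for illustration:

    import tensorflow as tf

    # Request Xavier/Glorot normal initialization explicitly.
    # (Keras Dense layers otherwise default to glorot_uniform.)
    layer = tf.keras.layers.Dense(
        64,
        activation="sigmoid",
        kernel_initializer=tf.keras.initializers.GlorotNormal(seed=42),
    )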

What is He_Normal initializer?

The he_normal initializer draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / fan_in), where fan_in is the number of input units in the weight tensor.
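
To make the formula concrete, here is a minimal NumPy sketch of the sampling rule. The layer shape is an assumption for illustration, and note that Keras uses a truncated normal (samples beyond two standard deviations are redrawn) rather than the plain normal used here:

    import numpy as np

    fan_in, fan_out = 784, 256          # hypothetical layer shape
    stddev = np.sqrt(2.0 / fan_in)      # he_normal: stddev = sqrt(2 / fan_in)
    W = np.random.normal(loc=0.0, scale=stddev, size=(fan_in, fan_out))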

Why do we use Xavier initialization?

The goal of Xavier initialization is to initialize the weights such that the variance of the activations is the same across every layer. This constant variance helps prevent the gradient from exploding or vanishing.
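
The constant-variance claim can be checked with a small NumPy experiment. The sketch below pushes unit-variance inputs through a stack of purely linear layers with Glorot-style weights (the layer width and depth are arbitrary assumptions); with fan_in equal to fan_out, the variance stays near 1 at every layer:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 256                                  # equal fan_in and fan_out
    x = rng.normal(size=(10_000, n))         # unit-variance inputs
    for layer in range(5):
        # Glorot normal: stddev = sqrt(2 / (fan_in + fan_out))
        W = rng.normal(0.0, np.sqrt(2.0 / (n + n)), size=(n, n))
        x = x @ W                            # linear layer, no activation
        print(f"layer {layer}: var = {x.var():.3f}")  # stays near 1.0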

What is Glorot normal initialization?

keras.initializers.glorot_normal draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / (fan_in + fan_out)), where fan_in is the number of input units in the weight tensor and fan_out is the number of output units in the weight tensor.
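
Putting the two formulas side by side shows the practical difference: for the same fan_in, He initialization is wider by roughly a factor of sqrt(2), which compensates for ReLU zeroing out about half of the activations. The shapes below are assumptions for illustration:

    import numpy as np

    fan_in, fan_out = 512, 256
    he_std = np.sqrt(2.0 / fan_in)                  # he_normal
    glorot_std = np.sqrt(2.0 / (fan_in + fan_out))  # glorot_normal (Xavier)
    print(he_std, glorot_std)                       # 0.0625 vs ~0.0510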


1 Answer

See this discussion on Stats.SE:

In summary, the main difference for machine learning practitioners is the following:

  • He initialization works better for layers with ReLU activation.
  • Xavier initialization works better for layers with sigmoid activation (a minimal Keras sketch of this pairing follows below).
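
As a rough sketch (the layer sizes and input shape are arbitrary assumptions), each layer's initializer is matched to its activation:

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(784,)),
        # ReLU layers pair with He initialization...
        tf.keras.layers.Dense(128, activation="relu",
                              kernel_initializer="he_normal"),
        # ...while sigmoid layers pair with Xavier/Glorot initialization.
        tf.keras.layers.Dense(1, activation="sigmoid",
                              kernel_initializer="glorot_normal"),
    ])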
answered Nov 13 '22 by Maxim