I'm porting my Caffe network over to TensorFlow but it doesn't seem to have xavier initialization. I'm using truncated_normal
but this seems to be making it a lot harder to train.
Xavier initialization is just sampling a (usually Gaussian) distribution where the variance is a function of the number of neurons. tf.random_normal can do that for you; you just need to compute the stddev yourself, deriving it from the number of neurons feeding into the weight matrix you're trying to initialize.
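For example, a minimal sketch of doing that by hand in TF 1.x (the 784x256 shape is just an illustration, and this uses the Gaussian variant of Xavier that scales by both fan-in and fan-out):

import numpy as np
import tensorflow as tf

fan_in, fan_out = 784, 256
# Gaussian Xavier/Glorot: stddev = sqrt(2 / (fan_in + fan_out))
stddev = np.sqrt(2.0 / (fan_in + fan_out))
W = tf.Variable(tf.random_normal([fan_in, fan_out], stddev=stddev))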
Xavier initialization is an attempt to improve the initialization of neural network weights in order to avoid some traditional training problems, such as vanishing and exploding gradients. The weights are drawn from a distribution whose variance is chosen so that the scale of the signal stays roughly constant as it passes from layer to layer.
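Concretely, the scheme from Glorot & Bengio (2010) sets the weight variance from the layer's fan-in and fan-out:

Var(W) = 2 / (fan_in + fan_out)

so a Gaussian version uses stddev = sqrt(2 / (fan_in + fan_out)), and the uniform version samples from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), which has the same variance.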
From the documentation: If initializer is None (the default), the default initializer passed in the variable scope will be used. If that one is None too, a glorot_uniform_initializer will be used. The glorot_uniform_initializer function initializes values from a uniform distribution whose limits are computed from the layer's fan-in and fan-out, i.e. Xavier/Glorot initialization.
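So in TF 1.x you get Xavier initialization without asking for it, as in this sketch (the shape is just an example):

import tensorflow as tf

# No initializer given, and none set on the surrounding variable scope,
# so this falls back to glorot_uniform_initializer.
W = tf.get_variable("W", shape=[784, 256])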
Initializers define the way to set the initial random weights of Keras layers. The keyword arguments used for passing initializers to layers depend on the layer. Usually, it is simply kernel_initializer and bias_initializer, for example:

from tensorflow.keras import layers
from tensorflow.keras import initializers

layer = layers.Dense(
    units=64,
    kernel_initializer=initializers.GlorotNormal(),
    bias_initializer=initializers.Zeros())
Since version 0.8 there is a Xavier initializer; see the docs for tf.contrib.layers.xavier_initializer.
You can use something like this:
W = tf.get_variable("W", shape=[784, 256], initializer=tf.contrib.layers.xavier_initializer())
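Note that tf.contrib was removed in TensorFlow 2.x; if you're on TF 2, an equivalent sketch would be:

import tensorflow as tf

# GlorotUniform is the TF 2.x name for the uniform Xavier initializer.
initializer = tf.keras.initializers.GlorotUniform()
W = tf.Variable(initializer(shape=(784, 256)))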