I want to make a conv network and I wish to use the ReLU activation function. Can someone please give me a clue about the correct way to initialize the weights? (I'm using Theano.)
Thanks
Weight initialization for ReLU: the current standard approach for initializing the weights of neural network layers and nodes that use the rectified linear (ReLU) activation function is He initialization.
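In practice that means drawing the weights from a zero-mean Gaussian with standard deviation sqrt(2 / fan_in), following He et al. (2015). Here is a minimal sketch for a conv filter bank; the filter_shape of (output channels, input channels, height, width) is just an illustrative assumption:

w_conv = None  # placeholder so the snippet below stands alone

import numpy
import theano

# Illustrative filter shape: 32 output channels, 3 input channels, 5x5 filters.
filter_shape = (32, 3, 5, 5)
fan_in = numpy.prod(filter_shape[1:])   # number of inputs feeding each output unit
std = numpy.sqrt(2.0 / fan_in)          # He initialization: std = sqrt(2 / fan_in)

w_conv = theano.shared(
    (numpy.random.randn(*filter_shape) * std).astype(theano.config.floatX),
    name='w_conv')
b_conv = theano.shared(
    numpy.zeros(filter_shape[0], dtype=theano.config.floatX),
    name='b_conv')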
Step-1, Initialization of the neural network: initialize weights and biases.
Step-2, Forward propagation: using the given input X, weights W, and biases b, for every layer we compute a linear combination of the inputs and weights (Z) and then apply the activation function to that linear combination (A). A short sketch of these two steps follows after the conclusion below.
Conclusion: zero initialization causes every neuron in a layer to compute the same function at each iteration. To break this symmetry, random initialization is the better choice; however, initializing with values that are too large or too small can slow down optimization.
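A minimal numpy sketch of those two steps; the layer sizes, batch, and He-style scaling are illustrative assumptions, not part of the original answer:

import numpy

def relu(z):
    # ReLU applied elementwise
    return numpy.maximum(0.0, z)

# Step-1: initialize weights (small random values break symmetry) and biases.
in_size, out_size = 4, 3
W = numpy.random.randn(in_size, out_size) * numpy.sqrt(2.0 / in_size)
b = numpy.zeros(out_size)

# Step-2: forward propagation for one layer.
X = numpy.random.randn(5, in_size)   # a batch of 5 illustrative inputs
Z = X.dot(W) + b                     # linear combination of inputs and weights
A = relu(Z)                          # activation applied to the linear combination

# If W were initialized to all zeros (or any constant), every unit in the layer
# would compute the same A and receive the same gradient update, so the units
# would never differentiate -- the symmetry that random initialization breaks.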
I'm not sure there is a hard and fast best way to initialize weights and bias for a ReLU layer.
Some claim that (a slightly modified version of) Xavier initialization works well with ReLUs. Others claim that small Gaussian random weights plus bias = 1 work well (ensuring that the weighted sum of positive inputs stays positive and thus does not end up in the ReLU's zero region).
In Theano, these can be achieved like this (assuming weights post-multiply the input):
w = theano.shared((numpy.random.randn(in_size, out_size) * 0.1).astype(theano.config.floatX))
b = theano.shared(numpy.ones(out_size, dtype=theano.config.floatX))
or
w = theano.shared((numpy.random.randn(in_size, out_size) * numpy.sqrt(2.0 / (in_size + out_size))).astype(theano.config.floatX))
b = theano.shared(numpy.zeros(out_size, dtype=theano.config.floatX))
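Either pair of shared variables can then be used in the layer's symbolic graph; a minimal sketch, assuming an input matrix x and that your Theano version provides T.nnet.relu (otherwise T.maximum(0, z) does the same job):

import theano.tensor as T

x = T.matrix('x')            # batch of inputs, shape (batch_size, in_size)
z = T.dot(x, w) + b          # weights post-multiply the input
output = T.nnet.relu(z)      # ReLU applied to the linear combination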