Initial bias values for a neural network

Question

I am currently building a CNN in TensorFlow and I am initialising my weight matrix using He normal weight initialisation. However, I am unsure how I should initialise my bias values. I am using ReLU as my activation function between each convolutional layer. Is there a standard method for initialising bias values?

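import numpy as np
import tensorflow as tf

# Define approximate Xavier weight initialization (with the ReLU correction described by He)
def xavier_over_two(shape):
    # shape = [filter_height, filter_width, in_channels, out_channels]
    fan_in = shape[0] * shape[1] * shape[2]
    # He normal: stddev = sqrt(2 / fan_in)
    std = np.sqrt(2.0 / fan_in)
    return tf.random_normal(shape, stddev=std)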

def bias_init(shape):
    return #???

Yahia Zakaria · Accepted Answer

Initializing the biases. It is possible and common to initialize the biases to zero, since the asymmetry breaking is provided by the small random numbers in the weights. For ReLU non-linearities, some people like to use a small constant value such as 0.01 for all biases, because this ensures that all ReLU units fire at the beginning and therefore obtain and propagate some gradient. However, it is not clear whether this provides a consistent improvement (in fact, some results seem to indicate that it performs worse), and it is more common to simply use 0 bias initialization.

source: http://cs231n.github.io/neural-networks-2/
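
To make the two options concrete, here is a minimal sketch of a bias_init helper. It uses NumPy rather than TensorFlow so that it runs on its own, and the `value` parameter and the 8-channel layer are my own additions for illustration, not something from the question or from the quoted notes.

```python
import numpy as np

def bias_init(shape, value=0.0):
    """Return a constant bias array.

    value=0.0 is the common default; value=0.01 is the small positive
    constant some people use with ReLU. The `value` parameter is an
    assumption added for illustration.
    """
    return np.full(shape, value, dtype=np.float32)

# Hypothetical layer with 8 output channels: one bias per channel.
zero_bias = bias_init((8,))
small_bias = bias_init((8,), value=0.01)

print(zero_bias)
print(small_bias)
```

In TensorFlow the same thing should be expressible as tf.zeros(shape) for the zero case and tf.constant(0.01, shape=shape) for the small constant case. Given the caveat in the quote above, zero is probably the safer starting point.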

Tags:

machine-learning

tensorflow

bias-neuron

Nick Bishop
