I'm building a CNN + ensemble model to classify images with TensorFlow in Python. I crawled dog and cat images from Google Images, resized them to 126 * 126 pixels, converted them to grayscale, and labeled dogs 0 and cats 1. The CNN has 5 conv layers and 2 FC layers, using He initialization, PReLU, max-pooling, dropout, and Adam. After parameter tuning I added early stopping; the model trained for 65~70 epochs and finished with 92.5~92.7% accuracy. After training finished, I wanted to change my CNN to a VGG network, so I checked my CNN parameters and, to my shock, found that I had not added a bias to the conv layers. The 2 FC layers had biases, but the 5 conv layers did not. So I added biases to the 5 conv layers, BUT then my model could not learn: the cost diverged to infinity.
Is a bias not necessary in deep convolutional layers?
Parameters of a convolutional layer: there is one bias for each output channel, and each bias is added to every element in that output channel. Note that the bias computation is often not shown in figures and is omitted in other texts describing convolutional arithmetic. Nevertheless, the biases are there.
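The per-channel broadcast described above can be sketched in NumPy (the array names here are illustrative, not from the question's code; `tf.nn.bias_add` performs the same broadcast in TensorFlow):

```python
import numpy as np

# Toy "convolution output": batch of 1, a 4x4 spatial map, 3 output channels.
conv_out = np.zeros((1, 4, 4, 3))

# One bias per output channel (shape matches the channel axis).
bias = np.array([0.1, -0.2, 0.5])

# Broadcasting adds each channel's bias to every spatial position
# of that channel -- the same arithmetic as tf.nn.bias_add.
out = conv_out + bias

print(out[0, 0, 0])  # [ 0.1 -0.2  0.5]
print(out[0, 3, 3])  # identical: the bias is shared across all positions
```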
Bias allows you to shift the activation function by adding a constant (i.e. the given bias) to its input. Bias in neural networks is analogous to the constant term in a linear function: the line is effectively translated by that constant value.
A bias vector is an additional set of weights that requires no input, and thus it corresponds to the output of the network when it has zero input. Bias can be thought of as an extra neuron in each pre-output layer that always holds the value 1.
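The shifting effect described in the two paragraphs above can be seen with a sigmoid unit (a minimal sketch; the weight and bias values are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = 1.0, 2.0

# Without a bias, sigmoid(w * x) crosses 0.5 at x = 0.
# Adding the bias translates that crossing point to x = -b/w.
print(sigmoid(w * 0 + b))         # no longer 0.5 at x = 0
print(sigmoid(w * (-b / w) + b))  # exactly 0.5 at the shifted center
```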
Since in a CNN each filter detects one feature, we introduce a variable (b) to incorporate the bias of that particular filter. Hence, each filter takes its own bias into account. The bias belongs to the filter as a whole, not to its individual weights.
How did you add your bias to the convolutional layer? There are two ways to do this: tied biases, which share one bias per kernel, and untied biases, which use one bias per kernel and output position.
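The difference between the two schemes is just the bias shape; a NumPy sketch (shapes chosen for illustration):

```python
import numpy as np

H, W, C = 4, 4, 3               # output height, width, channels
conv_out = np.zeros((1, H, W, C))

# Tied biases: one scalar per kernel/output channel, shared across positions.
tied = np.zeros(C)

# Untied biases: a separate bias for every output position of every channel.
untied = np.zeros((H, W, C))

out_tied = conv_out + tied       # broadcasts over H and W
out_untied = conv_out + untied   # element-wise, nothing shared

print(tied.size, untied.size)    # 3 vs 48 extra parameters
```

Tied biases are what standard TensorFlow conv layers use; untied biases add far more parameters and also break the translation invariance of the layer.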
Regarding your question of whether or not they are necessary: no. Biases in convolutional layers increase the capacity of your model, theoretically letting it represent more complex data. If your model already has enough capacity to do this, however, they are not necessary.
An example is this implementation of the 152-layer ResNet architecture, where the convolution layers have no bias; instead, the bias is handled by the subsequent batch-normalization layers.
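Why batch normalization makes a conv bias redundant can be shown directly: the mean subtraction cancels any constant added per channel. A minimal sketch (the `batch_norm` helper is a simplified stand-in for a real BN layer, without the learned scale and shift):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize per channel over the batch and spatial axes.
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 4, 3))    # pretend conv output
bias = np.array([1.0, -2.0, 0.5])    # a per-channel conv bias

# Subtracting the per-channel mean removes any per-channel constant,
# so a conv bias placed before batch norm has no effect.
print(np.allclose(batch_norm(x), batch_norm(x + bias)))  # True
```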