I'm training a neural network on data that contains both negative and positive values.
Is there any way to feed the data into a ReLU network without converting it all to positive and having a separate input which says if the data is negative or positive?
The problem I see is that a negative input at the input layer means that unless you have initialised your weights to be negative, the ReLU node isn't ever activated and is forever dead.
ReLU maps negative values to zero and keeps positive values unchanged.
Key among the limitations of ReLU is the case where large weight updates can mean that the summed input to the activation function is always negative, regardless of the input to the network. A node with this problem will forever output an activation value of 0.0. This is referred to as a “dying ReLU”.
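To make the failure mode concrete, here is a minimal NumPy sketch (with made-up weights) of a unit whose summed input is negative for every sample: its ReLU output is always zero, and because the ReLU gradient is zero for negative pre-activations, no update can ever bring it back.

```python
import numpy as np

# Hypothetical single ReLU unit whose weights were pushed strongly negative
# by a large update (values are made up for illustration).
w = np.array([-5.0, -3.0])
b = -1.0

x = np.abs(np.random.randn(1000, 2))  # non-negative inputs, e.g. pixel intensities

z = x @ w + b                 # pre-activation: negative for every sample
a = np.maximum(z, 0.0)        # ReLU output: all zeros -> the unit is "dead"

# The gradient of ReLU w.r.t. z is 1 where z > 0 and 0 otherwise,
# so no gradient flows back to w and the unit never recovers.
relu_grad = (z > 0).astype(float)
print(a.max(), relu_grad.sum())   # 0.0 0.0
```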
ReLU is the function max(x, 0), where the input x is, for example, a matrix from a convolved image. ReLU sets all negative values in the matrix x to zero and keeps all other values unchanged. ReLU is computed after the convolution and, like tanh or sigmoid, is a nonlinear activation function.
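As a minimal illustration, the element-wise operation can be written directly with NumPy:

```python
import numpy as np

# ReLU applied element-wise to a feature matrix (e.g. the output of a
# convolution): negative entries become 0, positive entries pass through.
x = np.array([[-2.0, 0.5],
              [ 3.0, -0.1]])

relu_out = np.maximum(x, 0.0)
print(relu_out)
# [[0.  0.5]
#  [3.  0. ]]
```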
ReLU can help with the vanishing gradient problem. If the input value is positive, the ReLU function returns it; if it is negative, it returns 0. The derivative of ReLU is 1 for values larger than zero, and because multiplying 1 by itself several times still gives 1, the gradient does not shrink as it is propagated back through many layers.
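A rough back-of-the-envelope sketch of that argument (the layer count and the positive pre-activation value are made up for illustration):

```python
import numpy as np

# The gradient surviving after backpropagating through n layers is
# (roughly) a product of n activation derivatives.
n_layers = 20
z = 2.0  # assume a positive pre-activation at every layer

sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
sigmoid_deriv = sigmoid(z) * (1.0 - sigmoid(z))   # <= 0.25, shrinks the gradient
relu_deriv = 1.0 if z > 0 else 0.0                # exactly 1 for positive inputs

print(sigmoid_deriv ** n_layers)  # ~1e-20: vanishingly small
print(relu_deriv ** n_layers)     # 1.0: gradient magnitude preserved
```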
So if a NN with the ReLU activation function is fed zero or negative inputs, those neurons can end up dead. My data contains several features with zero and negative values. What should I do in such a case? Should I use Leaky ReLU or some other variation of ReLU, or should I transform my data so that only positive values remain?
The output of ReLU is the maximum of zero and the input value, f(x) = max(0, x): the output is zero when the input is negative and equal to the input when the input is positive. Thus, the function can be rewritten piecewise as follows:

f(x) = x  if x > 0
f(x) = 0  if x ≤ 0
Without data scaling, on many problems the weights of the neural network can grow large, making the network unstable and increasing the generalization error. This good practice of scaling inputs applies whether or not you use ReLU in your network. By design, the output of ReLU is unbounded in the positive domain.
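A generic sketch of input scaling, not tied to any particular library; either min-max scaling or standardization is commonly applied before feeding data into a ReLU network:

```python
import numpy as np

# Toy feature matrix with mixed negative/positive values (made up for illustration).
X = np.array([[-3.0, 10.0],
              [ 0.0, 20.0],
              [ 6.0, 40.0]])

# Min-max scaling: every feature mapped into [0, 1].
X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Standardization: every feature to zero mean and unit variance
# (negative values remain, but their scale stays small and consistent).
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
```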
ReLU is a non-linear activation function that is used in multi-layer or deep neural networks. The function can be represented as f(x) = max(x, 0), where x is an input value: the output is zero when the input value is negative and equal to the input value when the input is positive.
I'm not really 100% sure what you're asking, as there are many activation functions and you can easily code your own. If you don't want to code your own, maybe try some alternatives (a short sketch follows after this list):
Leaky ReLU
Parametric ReLU
Basically, take a look here
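If you prefer not to implement them yourself, the major frameworks ship these as ready-made layers (for example torch.nn.LeakyReLU and torch.nn.PReLU in PyTorch, or tf.keras.layers.LeakyReLU and tf.keras.layers.PReLU in Keras). A minimal NumPy sketch of the idea:

```python
import numpy as np

# Leaky ReLU: instead of zeroing negative inputs, let a small fraction through,
# so the gradient is non-zero everywhere and units cannot die.
def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

# Parametric ReLU (PReLU) uses the same formula, but alpha is a learnable
# parameter instead of a fixed constant.
x = np.array([-4.0, -0.5, 0.0, 2.0])
print(leaky_relu(x))   # [-0.04  -0.005  0.     2.   ]
```

Because the negative slope is non-zero, gradients keep flowing even for negative inputs, which is exactly what prevents the dead-unit problem described above.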