I am trying to use leaky_relu as my activation function for hidden layers. For parameter alpha, it is explained as:
slope of the activation function at x < 0
What does this mean? What effect will different values of alpha have on the results of the model?
A deeper explanation of ReLU and its variants can be found in the following links:
The main drawback of regular ReLU is that the input to the activation can become negative, due to operations performed in the network, which leads to what is referred to as the "dying ReLU" problem:
the gradient is 0 whenever the unit is not active. This could lead to cases where a unit never activates as a gradient-based optimization algorithm will not adjust the weights of a unit that never activates initially. Further, like the vanishing gradients problem, we might expect learning to be slow when training ReLU networks with constant 0 gradients.
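A minimal sketch of that gradient behaviour (plain NumPy, with the gradient formulas written out by hand rather than taken from any particular library):

```python
import numpy as np

def relu_grad(x):
    # ReLU gradient: 1 for x > 0, 0 otherwise
    return (x > 0).astype(float)

def leaky_relu_grad(x, alpha=0.01):
    # Leaky ReLU gradient: 1 for x > 0, alpha otherwise
    return np.where(x > 0, 1.0, alpha)

x = np.array([-2.0, -0.5, 0.5, 2.0])
print(relu_grad(x))        # [0. 0. 1. 1.]       -> negative inputs get no update at all
print(leaky_relu_grad(x))  # [0.01 0.01 1. 1.]   -> negative inputs still learn a little
```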
Leaky ReLU replaces that zero slope on the negative side with some small value, say 0.001, which is the parameter referred to as "alpha". The function becomes f(x) = max(alpha*x, x). For x < 0 the gradient is now alpha instead of 0, so gradient descent keeps adjusting the unit's weights and learning does not hit a dead end. As for the effect of different values: alpha = 0 recovers plain ReLU, small values (0.001–0.3) let a little of the negative signal and gradient leak through, and alpha = 1 would make the activation purely linear.
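If you are using TensorFlow/Keras (an assumption here, since the question does not say which framework), a sketch of setting alpha for hidden layers might look like this; the layer sizes and alpha values are purely illustrative:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, input_shape=(10,)),
    tf.keras.layers.LeakyReLU(alpha=0.01),  # small slope: behaves almost like plain ReLU
    tf.keras.layers.Dense(64),
    tf.keras.layers.LeakyReLU(alpha=0.3),   # larger slope: more of the negative input leaks through
    tf.keras.layers.Dense(1),
])

# The functional form itself, f(x) = max(alpha * x, x):
x = tf.constant([-2.0, -0.5, 0.5, 2.0])
print(tf.nn.leaky_relu(x, alpha=0.01))  # [-0.02, -0.005, 0.5, 2.0]
```

Note that very recent Keras versions may name this parameter negative_slope instead of alpha, so check the documentation for the version you have installed.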