 

ReLU derivative in backpropagation

I am about to implement backpropagation on a neural network that uses ReLU. In a previous project of mine, I did it on a network that used the sigmoid activation function, but now I'm a little bit confused, since ReLU doesn't have a derivative.

Here's an image showing how weight5 contributes to the total error. In this example, ∂out/∂net = a*(1 - a) if I use the sigmoid function.

What should I write instead of "a*(1 - a)" to make the backpropagation work?
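For reference, a minimal sketch of the sigmoid case described above, using hypothetical names and values (net for the neuron's weighted input, a for its output, d_E_d_out for the upstream error term, none of which come from the original image); the a*(1 - a) factor is the ∂out/∂net part of the chain rule:

```python
import numpy as np

def sigmoid(net):
    # Forward pass: a = sigmoid(net)
    return 1.0 / (1.0 + np.exp(-net))

# Hypothetical values for one neuron during backpropagation
net = 0.4                      # weighted input to the neuron
a = sigmoid(net)               # neuron output
d_E_d_out = 0.75               # upstream gradient (assumed, for illustration)
d_out_d_net = a * (1.0 - a)    # the "a*(1 - a)" term from the question
d_net_d_w5 = 0.6               # output of the unit that weight5 multiplies (assumed)

d_E_d_w5 = d_E_d_out * d_out_d_net * d_net_d_w5
print(d_E_d_w5)
```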

asked Feb 04 '17 by Gergely Papp

People also ask

How does ReLU work in backpropagation?

Generally: a ReLU is a unit that uses the rectifier activation function. It works exactly like any other hidden unit, except that instead of tanh(x), sigmoid(x), or whatever activation you would otherwise use, it applies f(x) = max(0, x).
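A minimal sketch of that forward pass, assuming NumPy arrays; relu here is just an elementwise max(0, x):

```python
import numpy as np

def relu(x):
    # Elementwise rectifier: f(x) = max(0, x)
    return np.maximum(0, x)

net = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(net))   # [0.  0.  0.  0.5 2. ]
```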

What is the derivative of a ReLU function?

ReLU is differentiable at every point except 0. At z = 0 the left derivative is 0 and the right derivative is 1.
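A sketch of that derivative as code; the value at exactly 0 is a convention (0 is chosen here), since the left and right derivatives disagree at that point:

```python
import numpy as np

def relu_derivative(x):
    # f'(x) = 1 for x > 0, 0 for x < 0; x == 0 handled by convention (0 here)
    return np.where(x > 0, 1.0, 0.0)

net = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu_derivative(net))   # [0. 0. 0. 1. 1.]
```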

What is a derivative of the activation function used for in backpropagation?

The derivative (df(e)/de) is used by the optimization technique to locate the minima of the loss function. A large derivative results in a large adjustment to the corresponding weight.
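As a hedged illustration of that idea, a plain gradient-descent step on a single weight (the names and values are illustrative, not taken from the question); the size of the adjustment scales with the derivative:

```python
# Hypothetical single-weight gradient-descent update
learning_rate = 0.1
w5 = 0.35                 # current weight
d_E_d_w5 = 0.08           # derivative of the error with respect to w5

w5 = w5 - learning_rate * d_E_d_w5   # larger derivative -> larger adjustment
print(w5)
```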

Is ReLU activation function differentiable at origin?

The ReLU function is not differentiable at the origin, and the chain rule of multivariable calculus strictly applies only to differentiable functions, so at first glance backpropagation would seem unsuitable for networks with ReLUs. In practice this is not a problem: a subgradient (conventionally 0 or 1) is simply used at the origin.


1 Answer

since ReLU doesn't have a derivative.

No, ReLU does have a derivative. I assume you are using the ReLU function f(x) = max(0, x). It means that if x <= 0 then f(x) = 0, and otherwise f(x) = x. In the first case, when x < 0, the derivative of f(x) with respect to x is f'(x) = 0. In the second case, it is clearly f'(x) = 1. At x = 0 itself the derivative is undefined, but in practice you simply pick one of the one-sided values (usually 0) for backpropagation.
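Putting it together, a minimal sketch (with hypothetical variable names and values mirroring the chain-rule picture from the question) of what replaces a*(1 - a) when the activation is ReLU:

```python
import numpy as np

def relu(net):
    return np.maximum(0, net)

def relu_derivative(net):
    # 0 for net <= 0, 1 for net > 0: this is the replacement for a*(1 - a)
    return np.where(net > 0, 1.0, 0.0)

# Hypothetical values for the neuron that weight5 feeds into
net = 0.4
a = relu(net)                       # forward output (not needed for the ReLU gradient itself)
d_E_d_out = 0.75                    # upstream gradient (assumed)
d_out_d_net = relu_derivative(net)  # instead of a*(1 - a)
d_net_d_w5 = 0.6                    # assumed input that weight5 multiplies

d_E_d_w5 = d_E_d_out * d_out_d_net * d_net_d_w5
print(d_E_d_w5)
```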

answered Oct 06 '22 by malioboro