In its API documentation, it says "Computes rectified linear".
Is it Re(ctified) L(inear)... what is U then?
Re(ctified) L(inear) (U)nit
Usually a layer in a neural network takes some input, say a vector, and multiplies it by a weight matrix, resulting again in a vector.
Each value in the result (usually a float) is then considered an output. However, most layers in neural networks nowadays involve nonlinearities, hence an add-on function that, you might say, adds complexity to these output values. For a long time these have been sigmoids and tanhs.
But more recently people use a function that outputs 0 if the input is negative, and the input itself if the input is 0 or positive. This specific add-on function (better called an "activation function") is the ReLU.
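To make that concrete, here is a minimal sketch in NumPy (the function and variable names are just illustrative, not from any particular framework) of a dense layer followed by a ReLU:

```python
import numpy as np

def relu(z):
    # 0 for negative inputs, the input itself otherwise
    return np.maximum(0.0, z)

def dense_relu_layer(x, W, b):
    # x: input vector, W: weight matrix, b: bias vector (illustrative names)
    z = W @ x + b          # linear part: again a vector
    return relu(z)         # nonlinearity applied elementwise

# Tiny usage example
x = np.array([1.0, -2.0, 3.0])
W = np.array([[0.5, -1.0, 0.2],
              [1.5,  0.3, -0.7]])
b = np.array([0.1, -0.1])
print(dense_relu_layer(x, W, b))   # negative pre-activations become 0
```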
On top of Friesel's answer, I'd like to add two important characteristics of ReLU.

ReLU's graph: it's pointy, not curvy.
It is defined as f(x) = max(0, x), so it is not differentiable at x = 0 (unlike the sigmoid, whose derivative, written in terms of its output x, is the smooth x(1 - x)).

The derivative of ReLU:
f'(x) = 1 if x > 0
f'(x) = 0 otherwise
It's the simplest non-linear function we use, mostly in hidden layers. Think about how easy the backpropagation becomes!
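As a rough illustration of that point (NumPy again, names are illustrative): the local gradient of ReLU is just a 0/1 mask, so backpropagating through it is a single elementwise multiply:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # 1 where z > 0, 0 otherwise (the value at exactly 0 is a convention)
    return (z > 0).astype(float)

z = np.array([-1.5, 0.0, 2.0])
upstream = np.array([0.3, 0.3, 0.3])     # gradient arriving from the next layer
downstream = upstream * relu_grad(z)     # backprop through ReLU: just a mask
print(relu(z), relu_grad(z), downstream)
```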