relu as a parameter in Dense() (or any other layer) vs ReLU as a layer in Keras

I was just wondering if there is any significant difference in usage or purpose between

Dense(activation='relu')

and

keras.layers.ReLU

How and where can the latter one be used? My best guess is a Functional API use case, but I don't know how.

asked Sep 11 '25 by Deshwal

2 Answers

Creating a layer instance with the activation passed as a parameter, i.e. activation='relu', is the same as creating the layer instance and then adding a separate activation layer, e.g. a ReLU instance, after it. ReLU() is a layer whose call applies the K.relu() function to its inputs:

class ReLU(Layer):
    # ...
    def call(self, inputs):
        return K.relu(inputs,
                      alpha=self.negative_slope,
                      max_value=self.max_value,
                      threshold=self.threshold)
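For example, here is a minimal sketch of the two equivalent ways to build the same graph with tf.keras (the layer size and input shape are made up for illustration):

import tensorflow as tf

# Option 1: activation passed as an argument to Dense
model_a = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(32,)),
])

# Option 2: a linear Dense layer followed by a standalone ReLU layer
model_b = tf.keras.Sequential([
    tf.keras.layers.Dense(64, input_shape=(32,)),
    tf.keras.layers.ReLU(),
])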

From the Keras documentation:

Usage of activations

Activations can either be used through an Activation layer, or through the activation argument supported by all forward layers:

from keras.layers import Activation, Dense

model.add(Dense(64))
model.add(Activation('tanh'))

This is equivalent to:

model.add(Dense(64, activation='tanh'))

You can also pass an element-wise TensorFlow/Theano/CNTK function as an activation:

from keras import backend as K

model.add(Dense(64, activation=K.tanh))

Update:

Answering the OP's additional question, "How and where can the latter one be used?":

You can use it after a layer that doesn't accept an activation parameter, e.g. tf.keras.layers.Add, tf.keras.layers.Subtract, etc., when you still want a rectified output from that layer:

added = tf.keras.layers.Add()([x1, x2])
relu = tf.keras.layers.ReLU()(added)
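Here is a minimal Functional API sketch putting that together (the Dense sizes and input shape are illustrative); note that the ReLU layer is instantiated first and then called on the tensor:

import tensorflow as tf

inp = tf.keras.Input(shape=(16,))
x1 = tf.keras.layers.Dense(8)(inp)
x2 = tf.keras.layers.Dense(8)(inp)

added = tf.keras.layers.Add()([x1, x2])    # Add takes no activation argument
rectified = tf.keras.layers.ReLU()(added)  # so the ReLU is applied as its own layer

model = tf.keras.Model(inputs=inp, outputs=rectified)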
answered Sep 13 '25 by Geeocode



The most obvious use case is when you need to apply a ReLU without a Dense layer. For example, when implementing ResNet, the design requires a ReLU activation after summing the residual connection, as shown here:

x = layers.add([x, shortcut])
x = layers.Activation('relu')(x)
return x

It is also useful when you want to put a BatchNormalization layer between a Dense layer's pre-activation output and the ReLU activation. Likewise, when using a GlobalAveragePooling classifier (such as in the SqueezeNet architecture), you need to put a softmax activation after the GAP using Activation("softmax"), and there are no Dense layers in the network.
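A rough sketch of both patterns (the layer sizes, input shapes, and class count are illustrative, not taken from any particular architecture):

import tensorflow as tf
from tensorflow.keras import layers

# Pattern 1: BatchNormalization between a linear Dense pre-activation and the ReLU
inp1 = tf.keras.Input(shape=(32,))
x = layers.Dense(64)(inp1)                # no activation here
x = layers.BatchNormalization()(x)        # normalize the pre-activation
x = layers.ReLU()(x)                      # then apply the non-linearity
model1 = tf.keras.Model(inp1, x)

# Pattern 2: GlobalAveragePooling classifier with no Dense layers at all
num_classes = 10                          # illustrative
inp2 = tf.keras.Input(shape=(32, 32, 3))
y = layers.Conv2D(num_classes, 1)(inp2)   # 1x1 conv produces one feature map per class
y = layers.GlobalAveragePooling2D()(y)    # average each map down to a single score
y = layers.Activation('softmax')(y)       # standalone softmax, no Dense layer needed
model2 = tf.keras.Model(inp2, y)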

There are probably more cases; these are just a few examples.

answered Sep 13 '25 by Dr. Snoopy