 

Difference between Dense and Activation layer in Keras

I was wondering what the difference is between the Activation layer and the Dense layer in Keras.

Since the Activation layer seems to be a fully connected layer, and Dense has a parameter to pass an activation function, what is the best practice?

Let's imagine a fictional network like this: Input -> Dense -> Dropout -> Final Layer. Should the final layer be Dense(activation=softmax), or Dense followed by Activation(softmax)? Which is cleanest, and why?
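In code, the two candidates I have in mind would look something like this (the layer sizes here are made up):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Variant A: activation passed directly to the final Dense layer
model_a = keras.Sequential([
    keras.Input(shape=(16,)),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),
])

# Variant B: plain Dense followed by a separate Activation layer
model_b = keras.Sequential([
    keras.Input(shape=(16,)),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10),
    layers.Activation("softmax"),
])
```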

Thanks everyone!

asked Nov 29 '16 by Pusheen_the_dev




2 Answers

Using Dense(activation=softmax) is computationally equivalent to first adding Dense and then adding Activation(softmax). However, the second approach has one advantage: from a model defined that way, you can retrieve the outputs of the last layer before the activation is applied. With the first approach, that's impossible.
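As a sketch of that advantage, using the Keras functional API (the layer sizes here are made up), you can build the model the second way and define a second model over the same graph that outputs the pre-softmax values:

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(4,))           # hypothetical input size
hidden = layers.Dense(8, activation="relu")(inputs)
logits = layers.Dense(3)(hidden)           # no activation: raw pre-softmax outputs
probs = layers.Activation("softmax")(logits)

model = keras.Model(inputs, probs)         # the model you train
logit_model = keras.Model(inputs, logits)  # same weights, pre-activation outputs
```

Because `logit_model` shares the layers of `model`, it needs no extra training; it simply stops before the Activation layer. Had the final layer been Dense(3, activation="softmax"), there would be no tensor in the graph holding the pre-activation values.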

answered Sep 20 '22 by Marcin Możejko


As @MarcinMożejko said, it is equivalent. I just want to explain why. If you look at the Dense Keras documentation page, you'll see that the default activation function is None.

Mathematically, a dense layer is:

a = g(W.T*a_prev + b) 

where g is an activation function. When using Dense(units=k, activation=softmax), it computes all the quantities in one shot. When doing Dense(units=k) followed by Activation('softmax'), it first calculates the quantity W.T*a_prev + b (because the default activation function is None), and then applies the activation function specified in the Activation layer to that quantity.
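The equivalence can be checked numerically with a minimal NumPy sketch (the sizes and weights here are made up, not real Keras internals):

```python
import numpy as np

def softmax(z):
    # numerically stable softmax
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
n, k = 5, 3                      # n input features, k output units
W = rng.normal(size=(n, k))      # kernel
b = rng.normal(size=k)           # bias
a_prev = rng.normal(size=n)      # previous layer's activations

z = a_prev @ W + b               # what Dense(units=k) alone computes
a_two_step = softmax(z)          # what Activation('softmax') then applies

# what Dense(units=k, activation='softmax') computes in one shot
a_one_shot = softmax(a_prev @ W + b)

assert np.allclose(a_two_step, a_one_shot)
```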

answered Sep 19 '22 by Francesco Boi