I was wondering what the difference is between the Activation layer and the Dense layer in Keras.
Since the Activation layer seems to be a fully connected layer, and Dense has a parameter to pass an activation function, what is the best practice?
Let's imagine a fictional network like this: Input -> Dense -> Dropout -> Final Layer. Should the final layer be Dense(activation=softmax) or Activation(softmax)? Which is cleanest, and why?
Thanks everyone!
The Dense layer is the regular, densely connected neural network layer. It is the most common and frequently used layer. A Dense layer performs the following operation on the input and returns the output:

output = activation(dot(input, kernel) + bias)
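For concreteness, here is a minimal sketch (assuming TensorFlow's bundled Keras; the shapes and the relu activation are made up for illustration) that checks this formula against the layer's own output:

import numpy as np
import tensorflow as tf

# Hypothetical example: a Dense layer with 4 units and a relu activation.
layer = tf.keras.layers.Dense(4, activation="relu")
x = tf.random.normal((2, 3))   # batch of 2 samples, 3 features each
y = layer(x)                   # calling the layer builds kernel and bias

# Reproduce output = activation(dot(input, kernel) + bias) by hand.
manual = tf.nn.relu(tf.matmul(x, layer.kernel) + layer.bias)
print(np.allclose(y.numpy(), manual.numpy()))  # True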
For example, the selu activation multiplies scale (> 1) with the output of the keras.activations.elu function to ensure a slope larger than one for positive inputs. The values of alpha and scale are chosen so that the mean and variance of the inputs are preserved between two consecutive layers, as long as the weights are initialized correctly (see tf.keras.initializers.LecunNormal).
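If you do use selu, a hedged sketch of that pairing (the layer sizes here are arbitrary, not from the original post):

import tensorflow as tf

# Sketch only: selu is paired with lecun_normal initialization so that
# the mean/variance-preserving property described above can hold.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation="selu", kernel_initializer="lecun_normal"),
    tf.keras.layers.Dense(10),
])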
Activation functions are a critical part of the design of a neural network. The choice of activation function in the hidden layer will control how well the network model learns the training dataset. The choice of activation function in the output layer will define the type of predictions the model can make.
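As an illustration (the sizes and tasks are invented, not from the question), it is the output activation that changes with the prediction type:

import tensorflow as tf

# Multi-class classification: softmax outputs a probability distribution.
multiclass_head = tf.keras.layers.Dense(10, activation="softmax")
# Binary classification: sigmoid outputs a single probability.
binary_head = tf.keras.layers.Dense(1, activation="sigmoid")
# Regression: no activation (linear) outputs an unbounded real value.
regression_head = tf.keras.layers.Dense(1, activation=None)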
Yes, it is the same. model.add(Dense(10, activation=None)) in Keras and nn.Linear(128, 10) in PyTorch behave the same way here: neither applies an activation. If you don't specify anything, no activation is applied.
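A quick sketch (shapes made up) confirming that the output is then purely the affine map, with no nonlinearity:

import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dense(10, activation=None)  # same as omitting activation
x = tf.random.normal((5, 128))
y = layer(x)

# The output equals x @ kernel + bias exactly: nothing was activated.
print(np.allclose(y.numpy(), (tf.matmul(x, layer.kernel) + layer.bias).numpy()))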
Using Dense(activation=softmax) is computationally equivalent to first adding Dense and then adding Activation(softmax). However, there is one advantage of the second approach: you can retrieve the outputs of the last Dense layer (before the activation) from a model defined this way. With the first approach, that is impossible.
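A minimal sketch of that retrieval (the shapes and layer sizes are made up), using the functional API so the pre-softmax logits stay addressable:

import tensorflow as tf

# Hypothetical model built with a separate Activation layer.
inputs = tf.keras.Input(shape=(20,))
x = tf.keras.layers.Dense(32, activation="relu")(inputs)
x = tf.keras.layers.Dropout(0.5)(x)
logits = tf.keras.layers.Dense(10)(x)                # outputs before activation
outputs = tf.keras.layers.Activation("softmax")(logits)
model = tf.keras.Model(inputs, outputs)

# Because softmax lives in its own layer, the logits can be read out:
logit_model = tf.keras.Model(inputs, logits)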
As @MarcinMożejko said, it is equivalent. I just want to explain why. If you look at the Dense Keras documentation page, you'll see that the default activation function is None.

A Dense layer mathematically is:

a = g(W.T * a_prev + b)

where g is an activation function. When using Dense(units=k, activation=softmax), it computes all the quantities in one shot. When doing Dense(units=k) and then Activation('softmax'), it first calculates the quantity W.T * a_prev + b (because the default activation function is None), and then applies the activation function specified in the Activation layer to that quantity.
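To see the equivalence numerically (a sketch with invented shapes), copy the same weights into both variants and compare:

import numpy as np
import tensorflow as tf

x = tf.random.normal((4, 8))

# Variant 1: activation fused into the Dense layer.
fused = tf.keras.layers.Dense(3, activation="softmax")
y1 = fused(x)

# Variant 2: linear Dense followed by a separate Activation layer.
linear = tf.keras.layers.Dense(3)            # default activation is None
linear.build(x.shape)
linear.set_weights(fused.get_weights())      # share the same kernel and bias
y2 = tf.keras.layers.Activation("softmax")(linear(x))

print(np.allclose(y1.numpy(), y2.numpy()))   # True: identical outputs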