What is the difference between using softmax as a sequential layer in tf.keras and softmax as an activation function for a dense layer?


tf.keras.layers.Dense(10, activation=tf.nn.softmax)

and

tf.keras.layers.Softmax(10)
Asked Sep 28 '20 by Pavan elisetty


People also ask

What is Softmax in keras?

Softmax is often used as the activation for the last layer of a classification network because the result can be interpreted as a probability distribution. The softmax of each vector x is computed as exp(x) / tf.reduce_sum(exp(x)). The input values are the log-odds of the resulting probabilities.
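
As a quick illustration of that formula (a minimal sketch; the input values here are arbitrary), applying it to a small vector produces a valid probability distribution:

import tensorflow as tf

x = tf.constant([1.0, 2.0, 3.0])  # raw scores (logits)

# exp(x) / tf.reduce_sum(exp(x)), written out by hand
manual = tf.exp(x) / tf.reduce_sum(tf.exp(x))
builtin = tf.nn.softmax(x)

# both give [0.09003057, 0.24472848, 0.66524094], which sums to 1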

What is activation function in dense layer?

Activation functions are a critical part of the design of a neural network. The choice of activation function in the hidden layer will control how well the network model learns the training dataset. The choice of activation function in the output layer will define the type of predictions the model can make.
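
To make this concrete, here is a minimal sketch of a small classifier (the layer sizes are arbitrary): ReLU in the hidden layer shapes how the model learns, while softmax in the output layer makes it predict class probabilities:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(20,)),  # hidden layer: affects learning
    tf.keras.layers.Dense(10, activation='softmax'),                  # output layer: probability predictions
])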

What does Softmax activation do?

Softmax is a mathematical function that converts a vector of numbers into a vector of probabilities, where the probabilities of each value are proportional to the relative scale of each value in the vector.
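
For example, here is a minimal NumPy sketch of that conversion (subtracting the max before exponentiating is a standard numerical-stability trick and does not change the result):

import numpy as np

def softmax(v):
    e = np.exp(v - v.max())  # stabilize, then exponentiate
    return e / e.sum()       # normalize so the outputs sum to 1

softmax(np.array([2.0, 1.0, 0.1]))
# array([0.659001, 0.242433, 0.098566]) -- larger inputs get larger probabilities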

What does tf keras layers dense do?

A Keras Dense layer computes the dot product of its input tensor with a weight kernel matrix, adds a bias vector, and applies an element-wise activation to the result.


1 Answer

They are the same; you can test it yourself:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Softmax

# generate data
x = np.random.uniform(0, 1, (5, 20)).astype('float32')

# 1st option: softmax as the activation of a Dense layer
X = Dense(10, activation=tf.nn.softmax)
A = X(x)

# 2nd option: the same Dense weights, followed by a separate Softmax layer
w, b = X.get_weights()
B = Softmax()(tf.matmul(x, w) + b)

tf.reduce_all(A == B)
# <tf.Tensor: shape=(), dtype=bool, numpy=True>

Also pay attention when using tf.keras.layers.Softmax: it doesn't take a number of units, since it's a simple activation layer (its only argument is the axis to normalize over).

By default, the softmax is computed along the last axis (axis=-1). If your outputs have more than 2 dimensions and you want to apply softmax along a different axis, you can change this easily in the second option:
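
For example (a minimal sketch with arbitrary shapes), on a 3D tensor the axis argument chooses which dimension gets normalized into probabilities:

import tensorflow as tf

x = tf.random.uniform((2, 3, 4))  # e.g. (batch, timesteps, classes)

p_last = tf.keras.layers.Softmax()(x)        # default: normalize over the last axis (size 4)
p_time = tf.keras.layers.Softmax(axis=1)(x)  # normalize over the timestep axis (size 3)

tf.reduce_sum(p_last, axis=-1)  # all ones
tf.reduce_sum(p_time, axis=1)   # all ones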

Answered Oct 28 '22 by Marco Cerliani