I have a SimpleRNN like:
model.add(SimpleRNN(10, input_shape=(3, 1)))
model.add(Dense(1, activation="linear"))
The model summary says:
simple_rnn_1 (SimpleRNN) (None, 10) 120
I am curious about the parameter count of 120 for simple_rnn_1. Could someone answer my question?
When you look at the header of the table you see the title Param #:

Layer (type)                 Output Shape        Param #
=========================================================
simple_rnn_1 (SimpleRNN)     (None, 10)          120
This number represents the number of trainable parameters (weights and biases) in the respective layer, in this case your SimpleRNN.
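For reference, here is a minimal, self-contained sketch (assuming a standard TensorFlow/Keras install; the exact layer name in the summary may differ from simple_rnn_1) that rebuilds the model from the question and reproduces that summary row:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

model = Sequential()
# 10 units; input sequences of length 3 with 1 feature per timestep
model.add(SimpleRNN(10, input_shape=(3, 1)))
model.add(Dense(1, activation="linear"))
model.summary()  # the SimpleRNN row reports 120 trainable parameters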
Edit:
The formula for calculating the number of weights is as follows:

recurrent_weights + input_weights + biases

resp: (num_features + num_units) * num_units + num_units
Explanation:
num_units = the number of units in the RNN
num_features = the number of features of your input
Now you have two things happening in your RNN.
First, you have the recurrent loop, where the state is fed back into the model to generate the next step. The weights for the recurrent step are:

recurrent_weights = num_units * num_units
Second, you have the new input of your sequence at each step:

input_weights = num_features * num_units
(Usually the last RNN state and the new input are concatenated and then multiplied by one single weight matrix; nevertheless, the input and the last RNN state still use different weights within that matrix.)
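To illustrate that remark, here is a small NumPy sketch (the matrix names W_in and W_rec are made up for illustration, not Keras internals) showing that two separate multiplications equal one multiplication with the concatenated vector and the stacked matrix:

import numpy as np

num_features, num_units = 1, 10
x = np.random.randn(num_features)                # new input at this step
h = np.random.randn(num_units)                   # previous RNN state
W_in = np.random.randn(num_features, num_units)  # input weights
W_rec = np.random.randn(num_units, num_units)    # recurrent weights

separate = x @ W_in + h @ W_rec                              # two multiplications
stacked = np.concatenate([x, h]) @ np.vstack([W_in, W_rec])  # one multiplication
assert np.allclose(separate, stacked)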
So now we have the weights; what's missing are the biases, one bias for every unit:

biases = num_units * 1
So finally we have the formula:

recurrent_weights + input_weights + biases

or

num_units * num_units + num_features * num_units + biases
= (num_features + num_units) * num_units + biases
In your case this means the trainable parameters are:

10 * 10 + 1 * 10 + 10 = 120
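As a quick cross-check, here is a tiny helper (the function name rnn_param_count is hypothetical, chosen for illustration) that evaluates the formula above:

def rnn_param_count(num_units, num_features):
    # recurrent weights + input weights + one bias per unit
    return num_units * num_units + num_features * num_units + num_units

print(rnn_param_count(10, 1))  # 120, matching the model summary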
I hope this is understandable; if not, just tell me, so I can edit it to make it clearer.
It might be easier to understand visually with a simple network of 4 units and 3 input dimensions. The number of weights is 16 (4 * 4) + 12 (3 * 4) = 28 and the number of biases is 4, where 4 is the number of units and 3 is the number of input dimensions. So the formula is just like in the first answer:

num_units ^ 2 + num_units * input_dim + num_units

or simply

num_units * (num_units + input_dim + 1)

which yields 10 * (10 + 1 + 1) = 120 for the parameters given in the question.
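You can also verify this empirically by inspecting the weight shapes Keras stores for the layer (a sketch, assuming a standard TensorFlow/Keras install):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN

model = Sequential()
model.add(SimpleRNN(10, input_shape=(3, 1)))

kernel, recurrent_kernel, bias = model.layers[0].get_weights()
print(kernel.shape)            # (1, 10): input_dim x num_units input weights
print(recurrent_kernel.shape)  # (10, 10): num_units x num_units recurrent weights
print(bias.shape)              # (10,): one bias per unit
print(kernel.size + recurrent_kernel.size + bias.size)  # 120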