I have a SimpleRNN like:
model.add(SimpleRNN(10, input_shape=(3, 1)))
model.add(Dense(1, activation="linear"))
The model summary says:
simple_rnn_1 (SimpleRNN) (None, 10) 120
I am curious about the parameter count of 120 for simple_rnn_1. Could someone answer my question?
When you look at the header of the table you see the title Param #:

Layer (type)                 Output Shape        Param #
=========================================================
simple_rnn_1 (SimpleRNN)     (None, 10)          120
This number represents the number of trainable parameters (weights and biases) in the respective layer, in this case your SimpleRNN.
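For reference, here is a minimal, self-contained sketch (assuming a standard TensorFlow/Keras install; the exact layer name in the summary may differ from simple_rnn_1) that rebuilds the model from the question and reproduces that summary row:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

model = Sequential()
# 10 units; input sequences of length 3 with 1 feature per timestep
model.add(SimpleRNN(10, input_shape=(3, 1)))
model.add(Dense(1, activation="linear"))
model.summary()  # the SimpleRNN row reports 120 trainable parameters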
Edit:
The formula for calculating the number of weights is as follows:

recurrent_weights + input_weights + biases

resp: (num_features + num_units) * num_units + num_units
Explanation:
num_units = the number of units in the RNN
num_features = the number of features of your input
Now you have two things happening in your RNN.
First, you have the recurrent loop, where the state is fed back into the model to generate the next step. The weights for the recurrent step are:

recurrent_weights = num_units * num_units
Second, you have the new input of your sequence at each step:

input_weights = num_features * num_units
(Usually the last RNN state and the new input are concatenated and then multiplied by one single weight matrix; nevertheless, the input and the last RNN state still use different weights within that matrix.)
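To illustrate that remark, here is a small NumPy sketch (the matrix names W_in and W_rec are made up for illustration, not Keras internals) showing that two separate multiplications equal one multiplication with the concatenated vector and the stacked matrix:

import numpy as np

num_features, num_units = 1, 10
x = np.random.randn(num_features)                # new input at this step
h = np.random.randn(num_units)                   # previous RNN state
W_in = np.random.randn(num_features, num_units)  # input weights
W_rec = np.random.randn(num_units, num_units)    # recurrent weights

separate = x @ W_in + h @ W_rec                              # two multiplications
stacked = np.concatenate([x, h]) @ np.vstack([W_in, W_rec])  # one multiplication
assert np.allclose(separate, stacked)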
So now we have the weights; what's missing are the biases, one bias for every unit:

biases = num_units * 1
So finally we have the formula:

recurrent_weights + input_weights + biases

or

num_units * num_units + num_features * num_units + biases
= (num_features + num_units) * num_units + biases
In your case this means the trainable parameters are:

10 * 10 + 1 * 10 + 10 = 120
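As a quick cross-check, here is a tiny helper (the function name rnn_param_count is hypothetical, chosen for illustration) that evaluates the formula above:

def rnn_param_count(num_units, num_features):
    # recurrent weights + input weights + one bias per unit
    return num_units * num_units + num_features * num_units + num_units

print(rnn_param_count(10, 1))  # 120, matching the model summary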
I hope this is understandable; if not, just tell me, so I can edit it to make it clearer.
It might be easier to understand visually with a simple network of 4 units and 3 input dimensions. The number of weights is 16 (4 * 4) + 12 (3 * 4) = 28 and the number of biases is 4, where 4 is the number of units and 3 is the number of input dimensions. So the formula is just like in the first answer:

num_units ^ 2 + num_units * input_dim + num_units

or simply

num_units * (num_units + input_dim + 1)

which yields 10 * (10 + 1 + 1) = 120 for the parameters given in the question.
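You can also verify this empirically by inspecting the weight shapes Keras stores for the layer (a sketch, assuming a standard TensorFlow/Keras install):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN

model = Sequential()
model.add(SimpleRNN(10, input_shape=(3, 1)))

kernel, recurrent_kernel, bias = model.layers[0].get_weights()
print(kernel.shape)            # (1, 10): input_dim x num_units input weights
print(recurrent_kernel.shape)  # (10, 10): num_units x num_units recurrent weights
print(bias.shape)              # (10,): one bias per unit
print(kernel.size + recurrent_kernel.size + bias.size)  # 120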