When I run this code with Keras:
import numpy as np
from keras.layers import Input, SimpleRNN
from keras.models import Model

length = 10                          # sequence length (assumed; any value works)
noInput = np.zeros((1, length, 1))   # assumed all-zero input, as the name suggests

networkDrive = Input(batch_shape=(1, length, 1))
network = SimpleRNN(3, activation='tanh', stateful=False, return_sequences=True)(networkDrive)
generatorNetwork = Model(networkDrive, network)
predictions = generatorNetwork.predict(noInput, batch_size=length)
print(np.array(generatorNetwork.layers[1].get_weights()))
I am getting this output
[array([[ 0.91814435, 0.2490257 , 1.09242284]], dtype=float32)
array([[-0.42028981, 0.68996912, -0.58932084],
[-0.88647962, -0.17359462, 0.42897415],
[ 0.19367599, 0.70271438, 0.68460363]], dtype=float32)
array([ 0., 0., 0.], dtype=float32)]
I suppose that the (3,3) matrix is the weight matrix connecting the RNN units with each other, and that one of the other two arrays is probably the bias. But what is the third?
In the SimpleRNN implementation there are indeed three sets of weights.
weights[0]
is the input matrix. It transforms the input and therefore has shape [input_dim, output_dim].
weights[1]
is the recurrent matrix. It transforms the recurrent state and has shape [output_dim, output_dim].
weights[2]
is the bias vector. It is added to the output and has shape [output_dim].
The results of the three operations are summed and then passed through the activation function, so at each timestep t the layer computes h_t = tanh(x_t · weights[0] + h_{t-1} · weights[1] + weights[2]).
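As a minimal sketch (assuming a random input of the question's shape; the variable names here are illustrative, not from the original code), you can verify this by replaying the recurrence by hand with the extracted weights and comparing against what Keras predicts:

import numpy as np
from keras.layers import Input, SimpleRNN
from keras.models import Model

length = 5
inp = Input(batch_shape=(1, length, 1))
model = Model(inp, SimpleRNN(3, activation='tanh', return_sequences=True)(inp))

W_in, W_rec, b = model.layers[1].get_weights()   # shapes (1, 3), (3, 3), (3,)

x = np.random.randn(1, length, 1).astype('float32')
keras_out = model.predict(x)

# Replay the recurrence by hand: h_t = tanh(x_t @ W_in + h_{t-1} @ W_rec + b)
# The initial state h_0 is zero, which is the SimpleRNN default.
h = np.zeros(3, dtype='float32')
manual = []
for t in range(length):
    h = np.tanh(x[0, t] @ W_in + h @ W_rec + b)
    manual.append(h)

print(np.allclose(keras_out[0], np.array(manual), atol=1e-5))   # prints True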
I hope this is clearer now.