 

What does `layer.get_weights()` return?

Tags: python, keras

I'm using Keras to run some experiments and I just monitored the weight updates for a simple MLP model:

# model contains one input layer in the format of dense,
# one hidden layer and one output layer.
model = mlp()
weight_origin = model.layers[0].get_weights()[0]
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(.....)  # with adam optimizer
weight_updated = model.layers[0].get_weights()[0]
print(weight_origin - weight_updated)

For the first dense layer, I got a matrix of zeros. I thought training doesn't change this weight. However, the weights in the other layers did change. So I'm confused: why is the first layer unchanged? I checked the source code but still found no answer, so I tried monitoring:

model.layers[0].get_weights()[1] # get_weights() returns a list of weights

This time, the weights did change. So I'm wondering: which weights are the "true" ones that do the work during training? And why are there two elements in the weight list?


Definition of mlp():

from keras.models import Sequential
from keras.layers import Dense

def mlp():
    model = Sequential()
    model.add(Dense(500, input_dim=784))
    model.add(Dense(503, init='normal', activation='relu'))
    model.add(Dense(503, init='normal', activation='relu'))
    model.add(Dense(10, activation='softmax'))
    return model
asked Jan 22 '17 by Ludwig Zhou

3 Answers

For the question of layer.get_weights():

I ran some tests on this issue and checked the source code. I found that the Dense layer is a subclass of Layer, and that its weights are a Python list with two elements: the weight matrix (kernel) of the layer is stored at layer.get_weights()[0] and the bias is stored at layer.get_weights()[1].

One thing to note is that the bias can be disabled when defining the layer: model.add(Dense(503, init='normal', activation='relu', bias=False)). In that case, the list layer.get_weights() has only one element. If you set the bias attribute to False after the layer has been defined, there will still be an element for the bias, and it will be updated when you fit the model.
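
A minimal sketch to confirm this (using the same Keras 1.x syntax as the question; in Keras 2 the argument is spelled use_bias):

from keras.models import Sequential
from keras.layers import Dense

# with a bias: get_weights() returns [kernel, bias]
m1 = Sequential()
m1.add(Dense(10, input_dim=784))
print(len(m1.layers[0].get_weights()))      # 2
print(m1.layers[0].get_weights()[0].shape)  # (784, 10) -> kernel
print(m1.layers[0].get_weights()[1].shape)  # (10,)     -> bias

# without a bias: the list has a single element, the kernel
m2 = Sequential()
m2.add(Dense(10, input_dim=784, bias=False))
print(len(m2.layers[0].get_weights()))      # 1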

For the question of not updating:

I set up a Sequential model with only one dense layer:

def mlp_2():
    model = Sequential()
    model.add(Dense(10, input_dim=784, activation='softmax', bias=False))
    return model

Then I compiled and fit it the same way as above. This is what I got:

[screenshot of the training log: the printed weight difference is all zeros, while the accuracy increases each epoch]

It still seems as if the weights are not updated; however, we can tell they definitely changed, because the accuracy is increasing. I think the only explanation is that the updates to the first dense layer (the one where you define input_dim) are too small for Keras to print out. I didn't check the more precise values of the weights; it would be great if someone could confirm this.
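
One way to confirm this would be to look at the size of the largest update directly instead of relying on the printed array. A minimal sketch, assuming the mlp_2 model above and the X_train/Y_train arrays from the original setup:

import numpy as np

model = mlp_2()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

w_before = model.layers[0].get_weights()[0]
model.fit(X_train, Y_train)  # X_train/Y_train assumed from the original setup
w_after = model.layers[0].get_weights()[0]

# a nonzero value here proves the kernel was updated,
# even if the elementwise difference prints as zeros
print(np.abs(w_before - w_after).max())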

answered Oct 17 '22 by Ludwig Zhou


Here is a working example.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

X_train = np.random.rand(1, 10)
Y_train = 2 * X_train
input_dim = X_train.shape[1]

model = Sequential()
model.add(Dense(20, input_dim=input_dim))
model.add(Dense(10, activation='softmax'))

# snapshot the kernels before training
weight_origin_0 = model.layers[0].get_weights()[0]
weight_origin_1 = model.layers[1].get_weights()[0]

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=1, nb_epoch=10, verbose=1)  # epochs= in Keras 2

print(weight_origin_0 - model.layers[0].get_weights()[0])  # the first layer
print(weight_origin_1 - model.layers[1].get_weights()[0])  # the second layer
answered Oct 17 '22 by kimman


There is a way to see exactly how the values of all weights and biases change over time. You can use a Keras callback to record the weight values at each training epoch. Using a model like this, for example,

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(16, input_shape=train_inp_s.shape[1:]), Dense(12), Dense(6), Dense(1)])

add the callbacks keyword argument during fitting:

gw = GetWeights()
model.fit(X, y, validation_split=0.15, epochs=10, batch_size=100, callbacks=[gw])

where the callback is defined by

import numpy as np
from keras.callbacks import Callback

class GetWeights(Callback):
    # Keras callback which collects values of weights and biases at each epoch
    def __init__(self):
        super(GetWeights, self).__init__()
        self.weight_dict = {}

    def on_epoch_end(self, epoch, logs=None):
        # this function runs at the end of each epoch

        # loop over each layer and get weights and biases
        for layer_i in range(len(self.model.layers)):
            w = self.model.layers[layer_i].get_weights()[0]
            b = self.model.layers[layer_i].get_weights()[1]
            print('Layer %s has weights of shape %s and biases of shape %s' % (
                layer_i, np.shape(w), np.shape(b)))

            # save all weights and biases inside a dictionary
            if epoch == 0:
                # create arrays to hold the weights and biases
                self.weight_dict['w_' + str(layer_i + 1)] = w
                self.weight_dict['b_' + str(layer_i + 1)] = b
            else:
                # append new weights to the previously-created weights array
                self.weight_dict['w_' + str(layer_i + 1)] = np.dstack(
                    (self.weight_dict['w_' + str(layer_i + 1)], w))
                # append new biases to the previously-created biases array
                self.weight_dict['b_' + str(layer_i + 1)] = np.dstack(
                    (self.weight_dict['b_' + str(layer_i + 1)], b))

This callback builds a dictionary with all the layer weights and biases, labeled by the layer numbers, so you can see how they are changing over time as your model is being trained. You'll notice that the shape of each weight and bias array depends on the shape of the model layer. One weights array and one bias array are saved for each layer in your model. The third axis (depth) shows their evolution over time.

Here we used 10 epochs and a model with layers of 16, 12, 6, and 1 neurons:

for key in gw.weight_dict:
    print(str(key) + ' shape: %s' %str(np.shape(gw.weight_dict[key])))

w_1 shape: (5, 16, 10)
b_1 shape: (1, 16, 10)
w_2 shape: (16, 12, 10)
b_2 shape: (1, 12, 10)
w_3 shape: (12, 6, 10)
b_3 shape: (1, 6, 10)
w_4 shape: (6, 1, 10)
b_4 shape: (1, 1, 10)
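
Since the third axis indexes epochs, you can pull out the trajectory of any individual parameter. A small sketch, assuming the gw object from above (matplotlib is only needed for the plot):

import numpy as np
import matplotlib.pyplot as plt

# trajectory of kernel entry [0, 0] of the first layer across the 10 epochs
w1_trajectory = gw.weight_dict['w_1'][0, 0, :]
print(w1_trajectory.shape)  # (10,)

plt.plot(w1_trajectory)
plt.xlabel('epoch')
plt.ylabel('value of w_1[0, 0]')
plt.show()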
answered Oct 17 '22 by Eric M