Copying weights of a specific layer - keras

According to this, the following copies the weights from one model to another:

target_model.set_weights(model.get_weights())

What about copying the weights of a specific layer, would this work?

model_1.layers[0].set_weights(source_model.layers[0].get_weights())
model_2.layers[0].set_weights(source_model.layers[0].get_weights())

If I train model_1 and model_2, will they have separate weights? The documentation doesn't state whether get_weights makes a deep copy or not. If this doesn't work, how can this be achieved?

Asked Nov 15 '18 by bones.felipe

People also ask

How do you freeze layer weights in keras?

Freeze the required layers: in Keras, each layer has a parameter called "trainable". To freeze the weights of a particular layer, set this parameter to False, indicating that the layer should not be trained. That's it!

What is layer trainable false?

Setting a layer's trainable attribute to False moves all of the layer's weights from trainable to non-trainable. This is called "freezing" the layer: the state of a frozen layer won't be updated during training (either when training with fit() or when training with any custom loop that relies on trainable_weights to apply gradient updates).
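As a minimal sketch of the freezing mechanism described above (assuming tensorflow.keras; the layer sizes here are arbitrary, chosen only for illustration):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Two-layer model; we freeze the first Dense layer before compiling
model = Sequential()
model.add(Dense(10, input_dim=2))
model.add(Dense(1))

model.layers[0].trainable = False  # freeze: weights excluded from gradient updates
model.compile(loss='mse', optimizer='adam')

# Only the second layer's kernel and bias remain trainable
print(len(model.trainable_weights))      # 2 (second layer's kernel + bias)
print(len(model.non_trainable_weights))  # 2 (frozen layer's kernel + bias)
```

Note that if you toggle trainable after a model has been compiled, you need to compile again for the change to take effect.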


1 Answer

Of course, it would be a copy of the weights. It would not make sense for the weight arrays to be shared between two separate models. You can check it for yourself with a simple example like this:

from keras.models import Sequential
from keras.layers import Dense

# Two identical architectures, initialized independently
model1 = Sequential()
model1.add(Dense(10, input_dim=2))

model2 = Sequential()
model2.add(Dense(10, input_dim=2))

model1.compile(loss='mse', optimizer='adam')
model2.compile(loss='mse', optimizer='adam')

Test:

>>> model1.layers[0].get_weights()
[array([[-0.42853734,  0.18648076, -0.47137827,  0.1792168 ,  0.0373047 ,
          0.2765705 ,  0.38383502,  0.09664273, -0.4971757 ,  0.41548246],
        [ 0.0403192 , -0.01309097,  0.6656211 , -0.0536288 ,  0.58677703,
          0.21625364,  0.26447064, -0.42619988,  0.17218047, -0.39748642]],
       dtype=float32),
 array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)]

>>> model2.layers[0].get_weights()
[array([[-0.30062824, -0.3740575 , -0.3502644 ,  0.28050178, -0.68631136,
          0.1596322 ,  0.08288956, -0.20988202,  0.34323698,  0.2893324 ],
        [-0.29182747, -0.2754455 , -0.64082885,  0.29160154,  0.04342002,
         -0.4996035 ,  0.6608283 ,  0.10293472,  0.11375248, -0.43438092]],
       dtype=float32),
 array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)]

>>> model2.layers[0].set_weights(model1.layers[0].get_weights())
>>> model2.layers[0].get_weights()
[array([[-0.42853734,  0.18648076, -0.47137827,  0.1792168 ,  0.0373047 ,
          0.2765705 ,  0.38383502,  0.09664273, -0.4971757 ,  0.41548246],
        [ 0.0403192 , -0.01309097,  0.6656211 , -0.0536288 ,  0.58677703,
          0.21625364,  0.26447064, -0.42619988,  0.17218047, -0.39748642]],
       dtype=float32),
 array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)]

>>> id(model1.layers[0].get_weights()[0])
140494823634144

>>> id(model2.layers[0].get_weights()[0])
140494823635664

The ids of the kernel weight arrays are different, so they are distinct objects that happen to hold the same values: set_weights stores a copy, not a shared reference.
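To confirm this independence directly, one can copy the weights, then mutate the source layer's weights and check that the target layer is unaffected (again a sketch assuming tensorflow.keras):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model1 = Sequential([Dense(10, input_dim=2)])
model2 = Sequential([Dense(10, input_dim=2)])

# Copy model1's first-layer weights into model2
model2.layers[0].set_weights(model1.layers[0].get_weights())
before = model2.layers[0].get_weights()[0].copy()

# Zero out model1's kernel; model2 should keep the copied values
w = model1.layers[0].get_weights()
w[0][:] = 0.0
model1.layers[0].set_weights(w)

after = model2.layers[0].get_weights()[0]
print(np.array_equal(before, after))  # True: the copy is independent
```

This is why training model_1 and model_2 after the copy, as in the question, leaves them with separate weights.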

Answered Oct 26 '22 by today