Using Keras (1.2.2), I am loading a sequential model whose last layers are:
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
Then, I want to pop the last layer, add another fully connected layer, and re-add the classification layer.
model = load_model('model1.h5')
layer1 = model.layers.pop() # Copy activation_6 layer
layer2 = model.layers.pop() # Copy classification layer (dense_2)
model.add(Dense(512, name='dense_3'))
model.add(Activation('softmax', name='activation_7'))
model.add(layer2)
model.add(layer1)
print(model.summary())
As you can see, my dense_3 and activation_7 are not connected to the network (their "Connected to" column is empty in model.summary()). I cannot find anything in the documentation that explains how to solve this problem. Any ideas?
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
dense_1 (Dense)                  (None, 512)           131584      flatten_1[0][0]
____________________________________________________________________________________________________
activation_5 (Activation)        (None, 512)           0           dense_1[0][0]
____________________________________________________________________________________________________
dense_3 (Dense)                  (None, 512)           5632
____________________________________________________________________________________________________
activation_7 (Activation)        (None, 512)           0
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 10)            5130        activation_5[0][0]
____________________________________________________________________________________________________
activation_6 (Activation)        (None, 10)            0           dense_2[0][0]
====================================================================================================
Following the answer below, I compiled the model before printing model.summary(), but for some reason the layers are not being popped correctly, as the summary shows. The last layers' connections are wrong:
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
dense_1 (Dense)                  (None, 512)           131584      flatten_1[0][0]
____________________________________________________________________________________________________
activation_5 (Activation)        (None, 512)           0           dense_1[0][0]
____________________________________________________________________________________________________
dense_3 (Dense)                  (None, 512)           5632        activation_6[0][0]
____________________________________________________________________________________________________
activation_7 (Activation)        (None, 512)           0           dense_3[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 10)            5130        activation_5[0][0]
                                                                   activation_7[0][0]
____________________________________________________________________________________________________
activation_6 (Activation)        (None, 10)            0           dense_2[0][0]
                                                                   dense_2[1][0]
====================================================================================================
But it should be:
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
dense_1 (Dense)                  (None, 512)           131584      flatten_1[0][0]
____________________________________________________________________________________________________
activation_5 (Activation)        (None, 512)           0           dense_1[0][0]
____________________________________________________________________________________________________
dense_3 (Dense)                  (None, 512)           5632        activation_5[0][0]
____________________________________________________________________________________________________
activation_7 (Activation)        (None, 512)           0           dense_3[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 10)            5130        activation_7[0][0]
____________________________________________________________________________________________________
activation_6 (Activation)        (None, 10)            0           dense_2[0][0]
====================================================================================================
When you drop layers, you need to recompile your model in order for it to have any effect.
So use
model.compile(loss=..., optimizer=..., ...)
before printing the summary, and it should integrate the changes correctly.
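For example, a minimal sketch of that flow (the categorical_crossentropy loss and adam optimizer here are placeholders; use whatever you originally trained with):
from keras.models import load_model
from keras.layers import Dense, Activation

model = load_model('model1.h5')
model.layers.pop()  # drop activation_6
model.layers.pop()  # drop dense_2
model.add(Dense(512, name='dense_3'))
model.add(Activation('relu', name='activation_7'))
# recompiling makes Keras pick up the modified layer list
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())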
Edit:
What you are trying to do is actually really complex with a Sequential model. This is the solution I could come up with (if there is a better one, please tell me):
model = load_model('model1.h5')
layer1 = model.layers.pop() # Copy activation_6 layer
layer2 = model.layers.pop() # Copy classification layer (dense_2)
model.add(Dense(512, name='dense_3'))
model.add(Activation('softmax', name='activation_7'))
# get layer1 config
layer1_config = layer1.get_config()
layer2_config = layer2.get_config()
# rename the layers, otherwise Keras complains about duplicate names
layer1_config['name'] = layer1_config['name'] + '_new'
layer2_config['name'] = layer2_config['name'] + '_new'
# import the magic function that rebuilds a layer from its config
from keras.utils.layer_utils import layer_from_config
# re-add new layers built from the config of the old ones
# (class_name must be the class name as a string, e.g. 'Dense')
model.add(layer_from_config({'class_name': layer2.__class__.__name__, 'config': layer2_config}))
model.add(layer_from_config({'class_name': layer1.__class__.__name__, 'config': layer1_config}))
model.compile(...)
print(model.summary())
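Note that get_config() only describes a layer's architecture, not its weights, so the re-added dense_2_new starts from fresh random weights. If you want to keep the trained classifier weights, copy them across with the standard get_weights()/set_weights() layer methods (a small sketch; the [-2] index assumes the exact add() order from the snippet above):
# dense_2_new is second-to-last after the two add() calls above
model.layers[-2].set_weights(layer2.get_weights())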
The hack is needed because the popped layers keep their layer1.input and layer1.output properties, which still point into the old graph and which I couldn't change.
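You can see that stale wiring directly. A quick diagnostic sketch (inbound_nodes is a public layer attribute in Keras 1.x):
from keras.models import load_model

model = load_model('model1.h5')
layer1 = model.layers.pop()  # activation_6
# the popped layer still remembers the node that fed it
print(layer1.inbound_nodes)  # one Node, coming from dense_2
print(layer1.input)          # still dense_2's output tensor
# re-adding the same object therefore reuses this old connection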
A way around that is to use the functional API (Model). It lets you define explicitly what goes into and what comes out of each layer.
First you need to define your own pop() function that properly relinks the layers every time you pop one; the function comes from this github issue:
def pop_layer(model):
    if not model.outputs:
        raise Exception('Sequential model cannot be popped: model is empty.')
    popped_layer = model.layers.pop()
    if not model.layers:
        # nothing left: clear the model's graph bookkeeping entirely
        model.outputs = []
        model.inbound_nodes = []
        model.outbound_nodes = []
    else:
        # cut the link from the new last layer to the popped one,
        # and make the model's output the new last layer's output
        model.layers[-1].outbound_nodes = []
        model.outputs = [model.layers[-1].output]
    model.built = False
    return popped_layer
It just removes all output links of the last layer and changes the model's outputs to point to the new last layer. Now you can use it like this:
from keras.models import load_model, Model
from keras.layers import Dense, Activation

model = load_model('model1.h5')
layer1 = pop_layer(model)  # pop activation_6
layer2 = pop_layer(model)  # pop the classification layer (dense_2)
# take the model's new output (activation_5) and feed it to a new Dense layer
h = Dense(512, name='dense_3')(model.outputs[0])
h = Activation('relu', name='activation_7')(h)
# re-apply the popped classification layers on top
h = layer2(h)
output = layer1(h)
model = Model(input=model.input, output=output)
model.compile(...)
model.summary()
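If the relinking worked, the summary should now match the expected output from the question: dense_3 connected to activation_5[0][0], activation_7 to dense_3[0][0], dense_2 to activation_7[0][0], and activation_6 to dense_2[0][0]. Keep in mind that the new dense_3 starts with freshly initialized weights, while the reused layer1 and layer2 objects keep their trained ones, so the new head needs some retraining before the model is useful again.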
There are probably better solutions than this, but this is what I would do.
I hope this helps.