
How to convert a dense layer to an equivalent convolutional layer in Keras?


I would like to do something similar to the Fully Convolutional Networks paper (https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf) using Keras. I have a network which ends up flattening the feature maps and runs them through several dense layers. I would like to load the weights from a network like this into one where the dense layers are replaced with equivalent convolutions.

The VGG16 network which comes with Keras could be used as an example, where the 7x7x512 output of the last MaxPooling2D() is flattened and then goes into a Dense(4096) layer. In this case the Dense(4096) would be replaced with a 7x7 convolution with 4096 filters (a 7x7x512x4096 kernel).

My real network is slightly different: there is a GlobalAveragePooling2D() layer instead of MaxPooling2D() and Flatten(). The output of GlobalAveragePooling2D() is a 2D tensor, so there is no need to flatten it, and all the dense layers, including the first, would be replaced with 1x1 convolutions.

I've seen this question: "Python keras how to transform a dense layer into a convolutional layer", which seems very similar if not identical. The trouble is that I can't get the suggested solution to work, because (a) I'm using TensorFlow as the backend, so the weights rearrangement/filter "rotation" isn't right, and (b) I can't figure out how to load the weights. Loading the old weights file into the new network with model.load_weights(by_name=True) doesn't work, because the names don't match (and even if they did, the dimensions differ).

What should the rearrangement be when using TensorFlow?

How do I load the weights? Do I create one of each model, call model.load_weights() on both to load the identical weights and then copy some of the extra weights that need rearrangement?

Alex I asked Dec 15 '16 09:12



2 Answers

Based on hars's answer, I created this function to transform an arbitrary CNN into an FCN:

from keras.models import Sequential
from keras.layers.convolutional import Convolution2D
from keras.engine import InputLayer
import keras

def to_fully_conv(model):

    new_model = Sequential()

    input_layer = InputLayer(input_shape=(None, None, 3), name="input_new")

    new_model.add(input_layer)

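    flattened_ipt = False  # True right after a Flatten layer has been seen

    for layer in model.layers:

        if isinstance(layer, InputLayer):
            # new_model already has its own input layer
            continue

        if "Flatten" in str(layer):
            # remember the shape entering Flatten and drop the Flatten layer itself
            flattened_ipt = True
            f_dim = layer.input_shape
            continue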

        elif "Dense" in str(layer):

            input_shape = layer.input_shape
            output_dim =  layer.get_weights()[1].shape[0]
            W,b = layer.get_weights()

            if flattened_ipt:
                shape = (f_dim[1],f_dim[2],f_dim[3],output_dim)
                new_W = W.reshape(shape)
                new_layer = Convolution2D(output_dim,
                                          (f_dim[1],f_dim[2]),
                                          strides=(1,1),
                                          activation=layer.activation,
                                          padding='valid',
                                          weights=[new_W,b])
                flattened_ipt = False

            else:
                shape = (1,1,input_shape[1],output_dim)
                new_W = W.reshape(shape)
                new_layer = Convolution2D(output_dim,
                                          (1,1),
                                          strides=(1,1),
                                          activation=layer.activation,
                                          padding='valid',
                                          weights=[new_W,b])


        else:
            new_layer = layer

        new_model.add(new_layer)

    return new_model

You can test the function like this:

model = keras.applications.vgg16.VGG16()
new_model = to_fully_conv(model)
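
If you want to convince yourself that the plain reshape in the Flatten branch is right without loading VGG16, here is a small NumPy-only sketch. It assumes channels_last (TensorFlow) ordering and no activation; conv2d_valid is a naive helper written just for this check, not a Keras function, and the sizes are small stand-ins for 7, 7, 512 and 4096.

```python
import numpy as np

def conv2d_valid(x, kernel, bias):
    """Naive stride-1 'valid' cross-correlation, channels_last.
    x: (H, W, C_in), kernel: (kh, kw, C_in, C_out)."""
    kh, kw, _, c_out = kernel.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.empty((out_h, out_w, c_out))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i:i + kh, j:j + kw, :]
            # contract over kernel height, kernel width and input channels
            out[i, j] = np.tensordot(patch, kernel, axes=3) + bias
    return out

rng = np.random.default_rng(0)
h, w, c, units = 3, 3, 4, 5                 # stand-ins for 7, 7, 512, 4096
feature_map = rng.normal(size=(h, w, c))
W = rng.normal(size=(h * w * c, units))     # Dense kernel: (inputs, units)
b = rng.normal(size=units)

# Flatten + Dense (linear activation)
dense_out = feature_map.reshape(-1) @ W + b

# Same weights as a conv kernel: row-major reshape, no flipping or rotation
conv_kernel = W.reshape(h, w, c, units)
conv_out = conv2d_valid(feature_map, conv_kernel, b)   # shape (1, 1, units)

matches = np.allclose(dense_out, conv_out[0, 0])

# On a larger input the same kernel slides and gives a grid of outputs
bigger = rng.normal(size=(h + 2, w + 3, c))
grid_shape = conv2d_valid(bigger, conv_kernel, b).shape
```

The reshape works because Flatten on a channels_last tensor walks rows, then columns, then channels, which is the same order as the first three axes of the conv kernel. With channels_first data this would presumably need a transpose first.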
Oliver Wilken answered Sep 23 '22 20:09


a. No complicated rotation is needed. A plain reshape works.

b. Use get_weights() and pass the weights in when creating the new layer.

Iterate through model.layers, create the same layer from its config, and load the weights using set_weights or as shown below.

The following sketch works for me (Keras 2.0). The function name dense_to_conv is only illustrative.

from keras.layers import Convolution2D

def dense_to_conv(layer, f_dim=None):
    # layer: a Dense layer from the original model.
    # f_dim: input_shape of the Flatten layer, (None, rows, cols, channels);
    #        pass it only for the first dense layer after Flatten.
    # Assumes channels_last (TensorFlow) ordering.
    input_shape = layer.input_shape
    W, b = layer.get_weights()
    output_dim = b.shape[0]

    if f_dim is not None:
        # first dense layer: the kernel spans the whole feature map
        shape = (f_dim[1], f_dim[2], f_dim[3], output_dim)
        kernel_size = (f_dim[1], f_dim[2])
    else:
        # not the first dense layer: 1x1 convolution
        shape = (1, 1, input_shape[1], output_dim)
        kernel_size = (1, 1)

    # put the dense layer's weights into the new conv layer
    new_W = W.reshape(shape)
    return Convolution2D(output_dim, kernel_size, strides=(1, 1),
                         activation=layer.activation, padding='valid',
                         weights=[new_W, b])
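
For the 1x1 case (dense layers that do not follow a Flatten), the same kind of check can be done with NumPy alone. This is a rough sketch under the same assumptions as the reshape above (channels_last, no activation); conv2d_valid is a throwaway helper, not part of Keras, and the sizes are arbitrary.

```python
import numpy as np

def conv2d_valid(x, kernel, bias):
    """Naive stride-1 'valid' cross-correlation, channels_last.
    x: (H, W, C_in), kernel: (kh, kw, C_in, C_out)."""
    kh, kw, _, c_out = kernel.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1, c_out))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + kh, j:j + kw, :]
            out[i, j] = np.tensordot(patch, kernel, axes=3) + bias
    return out

rng = np.random.default_rng(1)
c_in, units = 6, 4
W = rng.normal(size=(c_in, units))      # Dense kernel: (inputs, units)
b = rng.normal(size=units)
fmap = rng.normal(size=(2, 3, c_in))    # (H, W, C_in) feature map

# Dense weights viewed as a 1x1 conv kernel
kernel_1x1 = W.reshape(1, 1, c_in, units)
conv_out = conv2d_valid(fmap, kernel_1x1, b)    # shape (2, 3, units)

# Dense applied to every pixel: matmul broadcasts over the (H, W) axes
dense_out = fmap @ W + b

same = np.allclose(conv_out, dense_out)
```

So a 1x1 convolution with the reshaped weights is just the dense layer applied independently at each spatial position.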
            
Harsha Pokkalla answered Sep 23 '22 20:09