Tying Autoencoder Weights in a Dense Keras Layer

Tags: python, keras, autoencoder

I am trying to create a custom Dense layer in Keras that ties weights in an autoencoder. I tried following an example that does this for convolutional layers here, but some of the steps did not seem to apply to a Dense layer (also, the code is over two years old).

By tying weights, I mean that the decode layer should use the transposed weight matrix of the encode layer. This approach is also taken in this article (page 5). The relevant quote from the article is below:

Here, we choose both the encoding and decoding activation function to be sigmoid function and only consider the tied weights case, in which W' = W^T (where W^T is the transpose of W) as most existing deep learning methods do.

In the quote above, W is the weight matrix in the encode layer and W' (equal to the transpose of W) is the weight matrix in the decode layer.
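The tied-weights case from the quote can be sketched as a forward pass in plain Python. This is a minimal illustration, not the Keras implementation: the helper names `sigmoid`, `transpose`, and `dense`, and all weight values, are made up for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def transpose(m):
    # Swap rows and columns of a list-of-lists matrix.
    return [list(col) for col in zip(*m)]

def dense(x, kernel, bias):
    # sigmoid(x @ kernel + bias) for a single input vector x;
    # kernel has shape (len(x), len(bias)), as in a Keras Dense layer.
    return [
        sigmoid(sum(xi * row[j] for xi, row in zip(x, kernel)) + bias[j])
        for j in range(len(bias))
    ]

# Hypothetical 4 -> 2 -> 4 autoencoder; the values are illustrative only.
W = [[0.1, -0.2],
     [0.4, 0.3],
     [-0.5, 0.2],
     [0.0, 0.1]]                 # encoder kernel, shape (4, 2)
b_enc = [0.0, 0.0]               # encoder bias
b_dec = [0.0, 0.0, 0.0, 0.0]     # decoder bias (biases are not tied)

x = [1.0, 0.0, 1.0, 0.0]
code = dense(x, W, b_enc)                           # encode with W
reconstruction = dense(code, transpose(W), b_dec)   # decode with W' = W^T
```

Only one kernel, W, is stored here; the decoder derives its kernel from it on every call, which is what the rest of the question tries to achieve inside Keras.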

I did not change much in the Dense layer. I added a tied_to parameter to the constructor, which lets you pass the layer you want to tie to. The only other change was to the build function; the snippet is below:

def build(self, input_shape):
    assert len(input_shape) >= 2
    input_dim = input_shape[-1]

    if self.tied_to is not None:
        self.kernel = K.transpose(self.tied_to.kernel)
        self._non_trainable_weights.append(self.kernel)
    else:
        self.kernel = self.add_weight(shape=(input_dim, self.units),
                                      initializer=self.kernel_initializer,
                                      name='kernel',
                                      regularizer=self.kernel_regularizer,
                                      constraint=self.kernel_constraint)
    if self.use_bias:
        self.bias = self.add_weight(shape=(self.units,),
                                    initializer=self.bias_initializer,
                                    name='bias',
                                    regularizer=self.bias_regularizer,
                                    constraint=self.bias_constraint)
    else:
        self.bias = None
    self.input_spec = InputSpec(min_ndim=2, axes={-1: input_dim})
    self.built = True

Below is the __init__ method; the only change here was the addition of the tied_to parameter.

def __init__(self, units,
             activation=None,
             use_bias=True,
             kernel_initializer='glorot_uniform',
             bias_initializer='zeros',
             kernel_regularizer=None,
             bias_regularizer=None,
             activity_regularizer=None,
             kernel_constraint=None,
             bias_constraint=None,
             tied_to=None,
             **kwargs):
    if 'input_shape' not in kwargs and 'input_dim' in kwargs:
        kwargs['input_shape'] = (kwargs.pop('input_dim'),)
    super(Dense, self).__init__(**kwargs)
    self.units = units
    self.activation = activations.get(activation)
    self.use_bias = use_bias
    self.kernel_initializer = initializers.get(kernel_initializer)
    self.bias_initializer = initializers.get(bias_initializer)
    self.kernel_regularizer = regularizers.get(kernel_regularizer)
    self.bias_regularizer = regularizers.get(bias_regularizer)
    self.activity_regularizer = regularizers.get(activity_regularizer)
    self.kernel_constraint = constraints.get(kernel_constraint)
    self.bias_constraint = constraints.get(bias_constraint)
    self.input_spec = InputSpec(min_ndim=2)
    self.supports_masking = True
    self.tied_to = tied_to

The call function was not edited, but it is below for reference.

def call(self, inputs):
    output = K.dot(inputs, self.kernel)
    if self.use_bias:
        output = K.bias_add(output, self.bias, data_format='channels_last')
    if self.activation is not None:
        output = self.activation(output)
    return output

In the build function above, I added a conditional that checks whether the tied_to parameter was set and, if so, sets the layer's kernel to the transpose of the tied_to layer's kernel.

Below is the code used to instantiate the model. It uses Keras's Sequential API, and DenseTied is my custom layer.

# encoder
#
encoded1 = Dense(2, activation="sigmoid")

decoded1 = DenseTied(4, activation="sigmoid", tied_to=encoded1)

# autoencoder
#
autoencoder = Sequential()
autoencoder.add(encoded1)
autoencoder.add(decoded1)

After training the model, below is the model summary and weights.

autoencoder.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_7 (Dense)              (None, 2)                 10        
_________________________________________________________________
dense_tied_7 (DenseTied)     (None, 4)                 12        
=================================================================
Total params: 22
Trainable params: 14
Non-trainable params: 8
________________________________________________________________
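The counts in this summary can be reproduced by hand. The sketch below assumes the 8 non-trainable parameters are the transposed kernel that build appends to _non_trainable_weights, and that the input has 4 features (inferred from the 4x2 encoder kernel printed further down).

```python
input_dim, code_dim = 4, 2

encoder_kernel = input_dim * code_dim    # 4 * 2 = 8
encoder_bias = code_dim                  # 2
decoder_kernel = code_dim * input_dim    # 2 * 4 = 8, the transposed kernel
decoder_bias = input_dim                 # 4

dense_params = encoder_kernel + encoder_bias         # Dense row
dense_tied_params = decoder_kernel + decoder_bias    # DenseTied row

# Only the encoder's weights and the decoder's bias are counted as trainable.
trainable = encoder_kernel + encoder_bias + decoder_bias
non_trainable = decoder_kernel
total = trainable + non_trainable

print(dense_params, dense_tied_params, total, trainable, non_trainable)
# 10 12 22 14 8
```

Under that assumption the summary lists the kernel twice (once per layer) even though only one copy is trained.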

autoencoder.layers[0].get_weights()[0]
array([[-2.122982  ,  0.43029135],
       [-2.1772149 ,  0.16689162],
       [-1.0465667 ,  0.9828905 ],
       [-0.6830663 ,  0.0512633 ]], dtype=float32)


autoencoder.layers[-1].get_weights()[1]
array([[-0.6521988 , -0.7131109 ,  0.14814234,  0.26533198],
       [ 0.04387903, -0.22077179,  0.517225  , -0.21583867]],
      dtype=float32)

As you can see, the weights reported by autoencoder.get_weights() do not seem to be tied.
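One way to make this check explicit is to compare the decoder kernel with the transpose of the encoder kernel, element by element. The sketch below uses the two arrays printed above; `is_tied` and its tolerance are hypothetical names chosen for the example.

```python
def transpose(m):
    # Swap rows and columns of a list-of-lists matrix.
    return [list(col) for col in zip(*m)]

def is_tied(encoder_kernel, decoder_kernel, tol=1e-6):
    # True if decoder_kernel equals transpose(encoder_kernel) within tol.
    expected = transpose(encoder_kernel)
    if len(expected) != len(decoder_kernel):
        return False
    return all(
        len(row_e) == len(row_d)
        and all(abs(a - b) <= tol for a, b in zip(row_e, row_d))
        for row_e, row_d in zip(expected, decoder_kernel)
    )

# Values copied from the get_weights() output above.
encoder_kernel = [[-2.122982, 0.43029135],
                  [-2.1772149, 0.16689162],
                  [-1.0465667, 0.9828905],
                  [-0.6830663, 0.0512633]]
decoder_kernel = [[-0.6521988, -0.7131109, 0.14814234, 0.26533198],
                  [0.04387903, -0.22077179, 0.517225, -0.21583867]]

print(is_tied(encoder_kernel, decoder_kernel))             # False
print(is_tied(encoder_kernel, transpose(encoder_kernel)))  # True
```

With NumPy arrays, the same comparison would be np.allclose(encoder_kernel.T, decoder_kernel).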

Having shown my approach, my question is: is this a valid way to tie weights in a Dense Keras layer? I was able to run the code, and it is currently training. The loss also seems to be decreasing reasonably. My fear is that this only sets the weights equal when the model is built and does not actually tie them. My hope is that the backend transpose function ties them through references under the hood, but I am sure that I am missing something.


asked Dec 12 '18 20:12

James Mchugh

1 Answer

So after showing my approach, my question is, is this a valid way to tie weights in a Dense Keras layer?

Yes, it's valid.

My fear is that this will only set them equal when the model is built, but not actually tie them. My hope is that the backend transpose function is tying them through references under the hood, but I am sure that I am missing something.

It actually ties them in the computation graph. You can confirm this by printing model.summary(): there is just one copy of these trainable weights. After training your model, you can also compare the weights of the corresponding layers with model.get_weights(). When the model is built there are no actual weight values yet, just placeholders for them.
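The difference between a one-time copy and a tie can be illustrated without Keras. This is only a plain-Python analogy, and the class names are hypothetical: CopiedDecoder behaves like the scenario the question fears, while TiedDecoder mirrors what the answer describes, where the transposed kernel is an operation that depends on the encoder's variable.

```python
def transpose(m):
    # Swap rows and columns of a list-of-lists matrix.
    return [list(col) for col in zip(*m)]

class Encoder:
    def __init__(self, kernel):
        self.kernel = kernel

class CopiedDecoder:
    # Snapshots the transpose once; later encoder updates are not seen.
    def __init__(self, encoder):
        self.kernel = transpose(encoder.kernel)

class TiedDecoder:
    # Recomputes the transpose from the encoder every time it is read.
    def __init__(self, encoder):
        self.encoder = encoder

    @property
    def kernel(self):
        return transpose(self.encoder.kernel)

encoder = Encoder([[1.0, 2.0], [3.0, 4.0]])
copied = CopiedDecoder(encoder)
tied = TiedDecoder(encoder)

# Simulate a training step that changes one encoder weight.
encoder.kernel[0][1] = 20.0

print(copied.kernel)  # [[1.0, 3.0], [2.0, 4.0]]
print(tied.kernel)    # [[1.0, 3.0], [20.0, 4.0]]
```

Comparing the encoder kernel with the decoder kernel after training is the equivalent test for a real model.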

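# Imports assume standalone Keras 2.x; adjust the paths for tf.keras.
import numpy as np
from keras import activations, constraints, initializers, regularizers
from keras import backend as K
from keras.layers import Dense, InputSpec, Layer
from keras.models import Sequential

np.random.seed(1)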

class DenseTied(Layer):
    def __init__(self, units,
                 activation=None,
                 use_bias=True,
                 kernel_initializer='glorot_uniform',
                 bias_initializer='zeros',
                 kernel_regularizer=None,
                 bias_regularizer=None,
                 activity_regularizer=None,
                 kernel_constraint=None,
                 bias_constraint=None,
                 tied_to=None,
                 **kwargs):
        self.tied_to = tied_to
        if 'input_shape' not in kwargs and 'input_dim' in kwargs:
            kwargs['input_shape'] = (kwargs.pop('input_dim'),)
        super().__init__(**kwargs)
        self.units = units
        self.activation = activations.get(activation)
        self.use_bias = use_bias
        self.kernel_initializer = initializers.get(kernel_initializer)
        self.bias_initializer = initializers.get(bias_initializer)
        self.kernel_regularizer = regularizers.get(kernel_regularizer)
        self.bias_regularizer = regularizers.get(bias_regularizer)
        self.activity_regularizer = regularizers.get(activity_regularizer)
        self.kernel_constraint = constraints.get(kernel_constraint)
        self.bias_constraint = constraints.get(bias_constraint)
        self.input_spec = InputSpec(min_ndim=2)
        self.supports_masking = True

    def build(self, input_shape):
        assert len(input_shape) >= 2
        input_dim = input_shape[-1]

        if self.tied_to is not None:
            self.kernel = K.transpose(self.tied_to.kernel)
            self._non_trainable_weights.append(self.kernel)
        else:
            self.kernel = self.add_weight(shape=(input_dim, self.units),
                                          initializer=self.kernel_initializer,
                                          name='kernel',
                                          regularizer=self.kernel_regularizer,
                                          constraint=self.kernel_constraint)
        if self.use_bias:
            self.bias = self.add_weight(shape=(self.units,),
                                        initializer=self.bias_initializer,
                                        name='bias',
                                        regularizer=self.bias_regularizer,
                                        constraint=self.bias_constraint)
        else:
            self.bias = None

        self.built = True

    def compute_output_shape(self, input_shape):
        assert input_shape and len(input_shape) >= 2
        # The last input dimension must be defined; it need not equal units.
        assert input_shape[-1]
        output_shape = list(input_shape)
        output_shape[-1] = self.units
        return tuple(output_shape)

    def call(self, inputs):
        output = K.dot(inputs, self.kernel)
        if self.use_bias:
            output = K.bias_add(output, self.bias, data_format='channels_last')
        if self.activation is not None:
            output = self.activation(output)
        return output


# input_ = Input(shape=(16,), dtype=np.float32)
# encoder
#
encoded1 = Dense(4, activation="sigmoid", input_shape=(4,), use_bias=True)
decoded1 = DenseTied(4, activation="sigmoid", tied_to=encoded1, use_bias=False)

# autoencoder
#
autoencoder = Sequential()
# autoencoder.add(input_)
autoencoder.add(encoded1)
autoencoder.add(decoded1)

autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

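autoencoder.summary()  # summary() prints the table itself and returns None

# randint's upper bound is exclusive, so use 2 to draw both 0 and 1.
autoencoder.fit(x=np.random.rand(100, 4), y=np.random.randint(0, 2, size=(100, 4)))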

print(autoencoder.layers[0].get_weights()[0])
print(autoencoder.layers[1].get_weights()[0])


answered Sep 22 '22 01:09

Mikhail Berlinkov
