Train Stacked Autoencoder Correctly

I am trying to build a Stacked Autoencoder in Keras (tf.keras). By "stacked" I do not just mean deep. All the examples I found for Keras generate e.g. 3 encoder layers and 3 decoder layers, train the whole thing, and call it a day. However, it seems the correct way to train a Stacked Autoencoder (SAE) is the one described in this paper: Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion

In short, an SAE should be trained layer-wise, as shown in the image below. After layer 1 is trained, its output is used as the input to train layer 2, and the reconstruction loss should be computed against layer 1's output, not against the input layer.

And here is where my trouble begins: how do I tell Keras which layers to apply the loss function to?

Here is what I do. Since the Autoencoder module no longer exists in Keras, I build the first autoencoder, then copy its encoder's weights into the first layer (with trainable = False) of a second autoencoder that has two hidden layers in total. But when I train that second autoencoder, it obviously compares the reconstructed layer out_s2 with the input layer in_s, instead of with layer 1, hid1.

import tensorflow as tf

# input_size and nodes (a list of layer widths) are defined elsewhere

# autoencoder layer 1
in_s = tf.keras.Input(shape=(input_size,))
noise = tf.keras.layers.Dropout(0.1)(in_s)
hid = tf.keras.layers.Dense(nodes[0], activation='relu')(noise)
out_s = tf.keras.layers.Dense(input_size, activation='sigmoid')(hid)

ae_1 = tf.keras.Model(in_s, out_s, name="ae_1")
ae_1.compile(optimizer='nadam', loss='binary_crossentropy', metrics=['acc'])

# autoencoder layer 2
hid1 = tf.keras.layers.Dense(nodes[0], activation='relu')(in_s)
noise = tf.keras.layers.Dropout(0.1)(hid1)
hid2 = tf.keras.layers.Dense(nodes[1], activation='relu')(noise)
out_s2 = tf.keras.layers.Dense(nodes[0], activation='sigmoid')(hid2)

ae_2 = tf.keras.Model(in_s, out_s2, name="ae_2")
# copy the trained encoder weights into ae_2's first Dense layer and freeze it
# (layers[0] is the Input layer, which has no weights, so index the Dense layers)
ae_2.layers[1].set_weights(ae_1.layers[2].get_weights())
ae_2.layers[1].trainable = False

ae_2.compile(optimizer='nadam', loss='binary_crossentropy', metrics=['acc'])

The solution should be fairly simple, but I can't see it, nor can I find it online. How do I do this in Keras?

asked Nov 30 '25 by Glrs

1 Answer

It seems the question is outdated, judging by the comments. But I'll still answer it, since the use case mentioned here is not specific to autoencoders and might be helpful in other situations.

So, when you say "train the whole network layer by layer", I would rather interpret it as "train small single-layer networks, one after another".

Looking at the code posted in the question, it seems the OP has already built the small networks. However, neither of these networks consists of a single layer.

The second autoencoder here takes as input the input of the first autoencoder. Instead, it should take as input the output of the first autoencoder.

So you train the first autoencoder, collect its predictions once it is trained, and then train the second autoencoder on those predictions, i.e. on the output of the first autoencoder.
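
To make that concrete, here is a minimal sketch of the sequence (the random data, the sizes, and the linear-output/mse pairing for the second autoencoder are my assumptions, not something from the question):

import numpy as np
import tensorflow as tf

# hypothetical sizes and data -- replace with your own
input_size = 784
nodes = [128, 64]
x_train = np.random.rand(1000, input_size).astype("float32")

# autoencoder 1: a single hidden layer, trained against the raw input
in_1 = tf.keras.Input(shape=(input_size,))
noise_1 = tf.keras.layers.Dropout(0.1)(in_1)
hid_1 = tf.keras.layers.Dense(nodes[0], activation='relu')(noise_1)
out_1 = tf.keras.layers.Dense(input_size, activation='sigmoid')(hid_1)

ae_1 = tf.keras.Model(in_1, out_1, name="ae_1")
ae_1.compile(optimizer='nadam', loss='binary_crossentropy')
ae_1.fit(x_train, x_train, epochs=10, batch_size=64)

# collect the hidden representations of the trained autoencoder 1
encoder_1 = tf.keras.Model(in_1, hid_1, name="encoder_1")
h1 = encoder_1.predict(x_train)

# autoencoder 2: trained against the output of autoencoder 1, not the raw input
in_2 = tf.keras.Input(shape=(nodes[0],))
noise_2 = tf.keras.layers.Dropout(0.1)(in_2)
hid_2 = tf.keras.layers.Dense(nodes[1], activation='relu')(noise_2)
# linear output + mse here, because the relu targets are not bounded to [0, 1]
out_2 = tf.keras.layers.Dense(nodes[0])(hid_2)

ae_2 = tf.keras.Model(in_2, out_2, name="ae_2")
ae_2.compile(optimizer='nadam', loss='mse')
ae_2.fit(h1, h1, epochs=10, batch_size=64)  # loss compared with layer 1's output

From here, the trained encoder layers can be stacked into one deep model for fine-tuning, if that is the end goal.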

Now let's focus on this part: "After layer 1 is trained, its output is used as the input to train layer 2, and the reconstruction loss should be computed against layer 1's output, not against the input layer."

Since the second network takes as input the output of layer 1 (autoencoder 1 in the OP's case), it will be comparing its own output against exactly that. The task is achieved.

To achieve this, though, you will need to write the model.fit(...) call, which is missing from the code provided in the question.

Also, in case you want the model to compute the loss against the input layer instead, simply replace the y argument of model.fit(...) with the input of autoencoder 1.
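
Continuing the sketch above, that would look like this (it only makes sense if ae_2's output layer is widened to input_size units, so that its reconstruction has the same shape as the original input):

# hypothetical: ae_2's output layer now has input_size units
ae_2.fit(h1, x_train, epochs=10, batch_size=64)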

In short, you just need to decouple these autoencoders into tiny single-layer networks and then train them as you wish. There is no need for trainable = False here anymore, though you may still use it if you like.

answered Dec 02 '25 by Kadam Parikh


