
Back propagation from decoder input to encoder output in variational autoencoder

I am trying to understand VAEs in depth by implementing one myself, and I am having difficulties back-propagating the loss from the decoder input layer to the encoder output layer.

[VAE architecture diagram]

My encoder network outputs 8 pairs (sigma, mu), which I then combine with the result of a stochastic sampler to produce the 8 input values (z) for the decoder network:

decoder_in = sigma * N(0,I) + mu
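To make that sampling step concrete, here is a minimal NumPy sketch of what I do (the variable names and the use of NumPy are just illustrative):

import numpy as np

rng = np.random.default_rng(0)

def sample_decoder_input(mu, sigma):
    # Reparameterized sampling: decoder_in = sigma * epsilon + mu, epsilon ~ N(0, I).
    # epsilon is returned as well, because it is reused later when pushing
    # gradients back from the decoder input to the encoder output.
    epsilon = rng.standard_normal(mu.shape)   # one standard-normal draw per latent dim
    decoder_in = sigma * epsilon + mu
    return decoder_in, epsilon

# e.g. an 8-dimensional latent code, as above
mu = np.zeros(8)
sigma = np.ones(8)
z, eps = sample_decoder_input(mu, sigma)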

Then I run forward propagation for the decoder network, compute the MSE reconstruction loss, and back-propagate the gradients up to the decoder input layer.

Here I am completely stuck, since I cannot find a comprehensible explanation of how to back-propagate the loss from the decoder input layer to the encoder output layer.

My best idea was to store the results of sampling from N(0,I) as epsilon and use them like this:

L(sigma) = epsilon * dLz(decoder_in)
L(mu) = 1.0 * dLz(decoder_in)

It kind of works, but in the long run the sigma components of the encoded distribution vector tend to regress to zero, so my VAE effectively regresses to a plain AE.

Also, I still have no clue how to integrate the KL loss into this scheme. Should I add it to the encoder loss, or somehow combine it with the decoder's MSE loss?

asked Aug 05 '20 by game development germ

People also ask

How does a variational autoencoder work?

A Variational Autoencoder is an explicit-type generative model used to generate new sample data from past data. VAEs map between latent variables, which dominate in explaining the training data, and the underlying distribution of the training data.

How is variational autoencoder different from autoencoder?

The variational autoencoder addresses the issue of a non-regularized latent space in the autoencoder and provides generative capability over the entire latent space. The encoder in an AE outputs latent vectors.

What is Elbo in variational autoencoder?

The ELBO is a lower bound on the logarithm of the marginal likelihood log p_x(x; θ), constructed by introducing an extra distribution q(z|x). The closer q(z|x) and the posterior p_{z|x}(·|x; θ) are, the tighter the bound is. The EM algorithm and the VAE both iteratively optimize the ELBO.

What is variational autoencoder VAE tutorial?

An Autoencoder is made of a pair of connected neural networks: an encoder model and a decoder model. Its goal is to find a way to encode the input (for example, celebrity faces) into a compressed form (latent space) in such a way that the reconstructed version is as close as possible to the input.

What are variational encoders in autoencoder?

To alleviate the issues present in a vanilla Autoencoder, we turn to Variational Autoencoders. The first change they introduce to the network is that instead of directly mapping the input data points into latent variables, the input data points get mapped to a multivariate normal distribution.

Are Variational autoencoders (VAE) eligible for backpropagation?

In this article, we are going to learn about the “reparameterization” trick that makes Variational Autoencoders (VAE) an eligible candidate for Backpropagation. First, we will discuss Autoencoders briefly and the problems that come with their vanilla variants. Then we will jump straight to the crux of the article — the “reparameterization” trick.

What is the function of decoder in autoencoder?

The function of the decoder is to generate an output from the latent vector that is very close to the input. Usually, in training autoencoders, we build these components together instead of building them independently.

What is the difference between general and variant autoencoders?

General autoencoders are trained using a reconstruction loss, which measures the difference between the reconstructed and original image. Variational autoencoders are mostly the same, but they use a sampling of the bottleneck vector from a normal distribution to reduce overfitting.


2 Answers

The VAE does not use the reconstruction error alone as the cost objective; if you use only that, the model just turns back into an autoencoder. The VAE uses the variational lower bound and a couple of neat tricks to make it easy to compute.

Referring to the original "Auto-Encoding Variational Bayes" paper:

The variational lower bound objective is (eq 10):

1/2 * ( d + sum_j log(sigma_j^2) - mu^T mu - sigma^T sigma ) + log p(x|z)

where d is the number of latent variables, mu and sigma are the outputs of the encoder network used to scale the standard normal samples, and z is the encoded sample. p(x|z) is just the probability that the decoder generates back the input x.

All the variables in the above equation are completely differentiable, so the objective can be optimized with gradient descent or any other gradient-based optimizer you find in TensorFlow.
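As a minimal NumPy sketch (illustrative names; MSE standing in for -log p(x|z) under a fixed-variance Gaussian decoder), the loss you would actually minimize is the negative of that bound:

import numpy as np

def vae_loss(x, x_reconstructed, mu, sigma):
    # KL term from eq 10: -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    kl = -0.5 * np.sum(1.0 + np.log(sigma ** 2) - mu ** 2 - sigma ** 2)
    # MSE reconstruction corresponds (up to constants) to -log p(x|z)
    # for a Gaussian decoder with fixed variance
    reconstruction = 0.5 * np.sum((x - x_reconstructed) ** 2)
    return reconstruction + kl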

answered Nov 30 '22 by Sarin Chandy


From what I understand, the solution should look like this:

L(sigma) = epsilon * dLz(decoder_in) - 0.5 * 2 / sigma + 0.5 * 2 * sigma
L(mu) = 1.0 * dLz(decoder_in) + 0.5 * 2 * mu
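A minimal NumPy sketch that checks these expressions against finite differences of the combined MSE + KL loss, using a toy linear decoder (the decoder, names, and shapes here are purely illustrative):

import numpy as np

rng = np.random.default_rng(0)

def grads_at_encoder_output(dL_dz, epsilon, mu, sigma):
    # dL_dz is the gradient of the reconstruction loss at the decoder input,
    # obtained from ordinary backprop through the decoder. The extra terms are
    # the derivatives of KL = 0.5 * sum(sigma^2 + mu^2 - 1 - log(sigma^2)).
    dL_dsigma = epsilon * dL_dz - 1.0 / sigma + sigma
    dL_dmu = dL_dz + mu
    return dL_dsigma, dL_dmu

# toy "decoder": x_hat = W @ z
W = rng.standard_normal((4, 8))
x = rng.standard_normal(4)
mu = rng.standard_normal(8)
sigma = np.abs(rng.standard_normal(8)) + 0.5
epsilon = rng.standard_normal(8)

def total_loss(mu, sigma):
    z = sigma * epsilon + mu                      # epsilon held fixed
    recon = 0.5 * np.sum((W @ z - x) ** 2)
    kl = 0.5 * np.sum(sigma ** 2 + mu ** 2 - 1.0 - np.log(sigma ** 2))
    return recon + kl

z = sigma * epsilon + mu
dL_dz = W.T @ (W @ z - x)                         # backprop through the toy decoder
dL_dsigma, dL_dmu = grads_at_encoder_output(dL_dz, epsilon, mu, sigma)

h = 1e-6
numeric = np.array([(total_loss(mu, sigma + h * np.eye(8)[i]) - total_loss(mu, sigma)) / h
                    for i in range(8)])
print(np.max(np.abs(dL_dsigma - numeric)))        # small -> analytic gradient matches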
answered Nov 30 '22 by Роман Проценко