Let's say a distribution is defined as below:
dist = tf.contrib.distributions.Normal(mu, sigma)
and a sample is drawn from the distribution:
val = dist.pdf(x)
and this value is used in a model to predict a variable:
X_hat = f(val)
loss = tf.norm(X_pred - X_hat, ord=2)
If I want to optimize the variables mu and sigma to reduce my prediction error, can I do the following?
train = tf.train.AdamOptimizer(1e-03).minimize(loss, var_list=[mu, sigma])
I am interested in knowing whether the gradient routines are propagated through the normal distribution, or whether I should expect issues because I am taking gradients over the parameters defining a distribution.
TensorFlow "records" relevant operations executed inside the context of a tf. GradientTape onto a "tape". TensorFlow then uses that tape to compute the gradients of a "recorded" computation using reverse mode differentiation.
It's only differentiable w.r.t. self. y but not the integer/discrete elements of self. actions_array.
The gradients are the partial derivatives of the loss with respect to each of the six variables. TensorFlow presents the gradient and the variable of which it is the gradient, as members of a tuple inside a list. We display the shapes of each of the gradients and variables to check that is actually the case.
tl;dr: Yes, gradient backpropagation will work correctly with tf.distributions.Normal.

dist.pdf(x) does not draw a sample from the distribution, but rather returns the probability density function evaluated at x. This is probably not what you wanted.
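To make the distinction concrete, here is a minimal sketch assuming the TF 1.x graph API used in the question (note that in tf.distributions.Normal the method is called prob rather than pdf):
import tensorflow as tf  # TF 1.x

mu = tf.Variable(0.0)
sigma = tf.Variable(1.0)
dist = tf.distributions.Normal(loc=mu, scale=sigma)

density = dist.prob(0.5)  # value of the density at x = 0.5: deterministic given mu and sigma
sample = dist.sample()    # a random draw from N(mu, sigma): changes every time it is evaluated

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(density), sess.run(density))  # identical values
    print(sess.run(sample), sess.run(sample))    # (almost surely) different values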
To get a random sample, what you really want is to call dist.sample(). For many random distributions, the dependency of a random sample on the parameters is nontrivial and will not necessarily be backpropable. However, as @Richard_wth pointed out, specifically for the normal distribution, it is possible through reparametrization to get a simple dependency on the location and scale parameters (mu and sigma).
In fact, in the implementation of tf.contrib.distributions.Normal (recently migrated to tf.distributions.Normal), that is exactly how sample is implemented:
def _sample_n(self, n, seed=None):
  ...
  # Draw from a standard normal, then scale and shift it (the reparametrization trick),
  # so the result is differentiable w.r.t. self.scale and self.loc.
  sampled = random_ops.random_normal(shape=shape, mean=0., stddev=1., ...)
  return sampled * self.scale + self.loc
Consequently, if you provide scale and location parameters as tensors, then backpropagation will work correctly on those tensors.
Note that this backpropagation is inherently stochastic: the gradient depends on the particular draw of the standard normal variable. However, in the long run (over many training examples), this is likely to work as you expect.
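Putting this together, here is a minimal end-to-end sketch under the question's TF 1.x setup; f and X_pred are not specified in the question, so the linear model and the constant target below are purely hypothetical stand-ins:
import tensorflow as tf  # TF 1.x graph API

# Trainable distribution parameters.
mu = tf.Variable(0.0)
sigma = tf.Variable(1.0)  # in practice you may want to keep this positive, e.g. via tf.nn.softplus

dist = tf.distributions.Normal(loc=mu, scale=sigma)
val = dist.sample([3])  # reparametrized internally: standard normal draw * sigma + mu

# Hypothetical stand-ins for f and X_pred from the question.
X_hat = 2.0 * val + 1.0
X_pred = tf.constant([1.0, 2.0, 3.0])
loss = tf.norm(X_pred - X_hat, ord=2)

train = tf.train.AdamOptimizer(1e-03).minimize(loss, var_list=[mu, sigma])

# Sanity check: gradients w.r.t. the distribution parameters exist (are not None).
grads = tf.gradients(loss, [mu, sigma])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grads))  # finite values that change from run to run, since the sample is random
    for _ in range(100):
        sess.run(train)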