Use of grad_ys parameter in tf.gradients - TensorFlow

Tags:

tensorflow

I want to understand the grad_ys parameter in tf.gradients. I've seen it used like a multiplier of the true gradient, but it's not clear from the definition. Mathematically, what would the whole expression look like?

asked Feb 22 '17 by Cristian Garcia


1 Answer

Edit: better clarification of notation is here

The ys are summed to make a single scalar y, and tf.gradients then computes dy/dx, where x stands for the variables in xs.

grad_ys represents the "starting" backprop values. They are 1 by default, but a different value can be specified when you want to chain several tf.gradients calls together -- you can pass the output of a previous tf.gradients call into grad_ys to continue the backprop flow.
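For concreteness, here is a minimal sketch of that chaining, written against the TensorFlow 1.x graph API (in TF 2 the same calls live under tf.compat.v1); the variables and names are just for illustration:

```python
import tensorflow as tf  # TF 1.x; in TF 2 use tf.compat.v1 and disable eager execution

x = tf.Variable(3.0)
y = tf.square(x)   # y = x^2
z = tf.square(y)   # z = y^2 = x^4

# One-shot gradient: dz/dx = 4*x^3
dz_dx = tf.gradients(z, x)

# The same gradient, chained through the intermediate tensor y:
dz_dy = tf.gradients(z, y)                          # dz/dy = 2*y
dz_dx_chained = tf.gradients(y, x, grad_ys=dz_dy)   # (dz/dy) * (dy/dx)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run([dz_dx, dz_dx_chained]))  # both print [108.0] at x = 3.0
```

Leaving grad_ys at its default is equivalent to passing a tensor of ones, which is why the plain call and the chained call agree.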

For a formal definition, look at the chained expression under Reverse accumulation here: https://en.wikipedia.org/wiki/Automatic_differentiation#Reverse_accumulation

The term corresponding to dy/dw3 * dw3/dw2 in TensorFlow is a vector of 1s (think of it as if TensorFlow wraps the cost with a dummy identity op). When you specify grad_ys, this term is replaced by grad_ys instead of the vector of 1s.
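Written out (using g for grad_ys; the symbol is mine, not from the answer above), the whole expression the question asks about is a vector-Jacobian product:

```latex
\texttt{tf.gradients}(y, x)
  \;=\; \frac{\partial}{\partial x}\sum_i y_i
  \qquad\text{(default, i.e. } g = \mathbf{1}\text{)}

\texttt{tf.gradients}(y, x, \texttt{grad\_ys} = g)
  \;=\; \sum_i g_i \,\frac{\partial y_i}{\partial x}
  \;=\; g^{\top}\frac{\partial y}{\partial x}
```

So grad_ys plays the role of the upstream (incoming) gradient, which is exactly why feeding one tf.gradients call's output into the next call's grad_ys continues the backprop.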


answered Oct 19 '22 by Yaroslav Bulatov