In TensorFlow, suppose one has two tensors x and y and wants the gradients of y with respect to x using tf.gradients(y, x). What one actually gets is:
gradient[n,m] = sum_ij dy[i,j] / dx[n,m]
There is a sum over the indices of y. Is there a way to avoid this implicit sum and instead get the whole gradient tensor gradient[i,j,n,m]?
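For concreteness, here is a minimal sketch of the summing behaviour (the 2x2 values and the choice y = x**2 are my own illustration, not from the question):

import tensorflow as tf

x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y = x ** 2  # elementwise, so dy[i,j]/dx[n,m] = 2*x[n,m] if (i,j) == (n,m), else 0

# tf.gradients sums the Jacobian over the indices of y, collapsing it
# back to the shape of x: each entry is 2*x[n,m], not a rank-4 tensor.
g = tf.gradients(y, x)[0]

with tf.Session() as sess:
    print(sess.run(g))  # [[2. 4.] [6. 8.]], shape (2, 2)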
Here is my workaround, which takes the derivative of each component (as also mentioned by @Yaroslav) and then packs them all together again, for the case of rank-2 tensors (matrices):
import tensorflow as tf

def twodtensor2list(tensor, m, n):
    # Slice an m x n tensor into a flat list of its m*n scalar (1x1) entries.
    s = [[tf.slice(tensor, [j, i], [1, 1]) for i in range(n)] for j in range(m)]
    fs = []
    for l in s:
        fs.extend(l)
    return fs

def grads_all_comp(y, shapey, x, shapex):
    # Differentiate each component of y separately, then pack the results
    # into a rank-4 tensor of shape shapey + shapex.
    yl = twodtensor2list(y, shapey[0], shapey[1])
    grads = [tf.gradients(yle, x)[0] for yle in yl]
    gradsp = tf.pack(grads)
    gradst = tf.reshape(gradsp, shape=(shapey[0], shapey[1], shapex[0], shapex[1]))
    return gradst
Now grads_all_comp(y, shapey, x, shapex) will output the rank-4 tensor in the desired format. It is a very inefficient way, because everything needs to be sliced up and repacked together, so if someone finds a better way I would be very interested to see it.
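For what it's worth, here is how it can be called; the 2x2 shapes and the choice y = tf.matmul(x, x) are just an example:

x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y = tf.matmul(x, x)

# Full Jacobian dy[i,j]/dx[n,m] as a rank-4 tensor of shape (2, 2, 2, 2).
full_grad = grads_all_comp(y, [2, 2], x, [2, 2])

with tf.Session() as sess:
    print(sess.run(full_grad).shape)  # (2, 2, 2, 2)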
There isn't a way. TensorFlow 0.11's tf.gradients implements standard reverse-mode AD, which gives the derivative of a scalar quantity. You'd need to call tf.gradients for each y[i,j] separately.
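A compact way to do that looping (the tf.gather/tf.reshape plumbing is just one possible arrangement, shown for a hypothetical 2x2 case):

import tensorflow as tf

x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y = x ** 2

# Flatten y and differentiate each scalar component separately;
# each tf.gradients call produces one slice of the full Jacobian.
y_flat = tf.reshape(y, [-1])
rows = [tf.gradients(tf.gather(y_flat, i), x)[0] for i in range(4)]
jacobian = tf.reshape(tf.pack(rows), [2, 2, 2, 2])

with tf.Session() as sess:
    print(sess.run(jacobian).shape)  # (2, 2, 2, 2)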