
Tensorflow gradients: without automatic implicit sum

In TensorFlow, if one has two tensors x and y and wants the gradients of y with respect to x using tf.gradients(y, x), then what one actually gets is:

gradient[n,m] = sum_ij d y[i,j]/ d x[n,m]

There is a sum over the indices of y. Is there a way to avoid this implicit sum and get the whole gradient tensor gradient[i,j,n,m]?
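To illustrate the implicit sum (this snippet is not from the original post, and it uses eager-mode TensorFlow 2.x, which postdates the question; tf.GradientTape.gradient behaves like tf.gradients here):

```python
import tensorflow as tf

x = tf.Variable([[1.0, 2.0], [3.0, 4.0]])
with tf.GradientTape() as tape:
    y = x * x                 # y[i,j] = x[i,j]**2, a rank-2 tensor

# gradient[n,m] = sum_ij d y[i,j] / d x[n,m] -- the summed form
grad = tape.gradient(y, x)

# Here each y[i,j] depends only on x[i,j], so the sum collapses
# and grad equals 2*x.
print(grad)
```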

asked Oct 12 '16 by patapouf_ai

2 Answers

Here is my workaround: take the derivative of each component (as also mentioned by @Yaroslav) and then pack them all back together, in the case of rank-2 tensors (matrices):

import tensorflow as tf

def twodtensor2list(tensor, m, n):
    # Slice an m x n tensor into a flat list of its m*n scalar (1x1) entries.
    s = [[tf.slice(tensor, [j, i], [1, 1]) for i in range(n)] for j in range(m)]
    fs = []
    for l in s:
        fs.extend(l)
    return fs

def grads_all_comp(y, shapey, x, shapex):
    # Differentiate each component of y with respect to x separately...
    yl = twodtensor2list(y, shapey[0], shapey[1])
    grads = [tf.gradients(yle, x)[0] for yle in yl]
    # ...then stack the per-component gradients and reshape into the
    # rank-4 tensor gradient[i,j,n,m].
    # (tf.pack was renamed tf.stack in TensorFlow >= 0.12.)
    gradsp = tf.pack(grads)
    gradst = tf.reshape(gradsp, shape=(shapey[0], shapey[1], shapex[0], shapex[1]))
    return gradst

Now grads_all_comp(y, shapey, x, shapex) will output the rank-4 tensor in the desired format. It is a very inefficient way because everything needs to be sliced up and repacked together, so if someone finds a better way I would be very interested to see it.

answered Nov 11 '22 by patapouf_ai

There isn't a way. TensorFlow 0.11's tf.gradients implements standard reverse-mode AD, which gives the derivative of a scalar quantity. You'd need to call tf.gradients for each y[i,j] separately.

answered Nov 11 '22 by Yaroslav Bulatov