 

tensorflow: is differentiable indexing possible?

Tags:

tensorflow

Is differentiable indexing of an array possible with TensorFlow? More specifically, if I have a variable of floats that somehow transforms the indices of an array, can I obtain the gradient of the transformed array with respect to that variable? It seems like this should be differentiable, based on the gradient derivations from Spatial Transformer Networks (https://arxiv.org/pdf/1506.02025v3.pdf and https://github.com/tensorflow/models/blob/master/transformer/spatial_transformer.py). I have tried implementing this, but I am running into problems: I have to cast the transformed indices to integers before using tf.gather to transform the array, and gradients do not seem to pass through that cast. Does anyone have a suggestion on how to do this?
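For reference, here is a minimal sketch of the failure mode described above (the array, the float index, and all names are illustrative assumptions, not from the question; TensorFlow 2 eager mode is assumed):

```python
import tensorflow as tf

values = tf.constant([10.0, 20.0, 30.0, 40.0])  # array to be indexed
idx = tf.Variable(1.7)  # float "index" produced by some transformation

with tf.GradientTape() as tape:
    hard_idx = tf.cast(idx, tf.int32)   # the cast has no gradient
    read = tf.gather(values, hard_idx)  # hard lookup of one element

# None: no gradient flows back through the integer cast to idx
print(tape.gradient(read, idx))
```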

asked Jul 28 '16 by user873261

1 Answer

No, you cannot differentiate indexing in the traditional sense of "read the k-th element of the array".

But the trick is to replace the hard index with a weighted sum over all of the elements. The weights should sum to 1, making them a probability distribution, and it is then your network's job to learn a high-confidence distribution. This is how a forget gate in an LSTM works as well.

So instead of "reading the k-th element"... you are (hopefully) "taking the sum of ~0% of all the elements other than the k-th, and ~100% of the k-th element".

Of course, if your model that creates the weights isn't converging, then you'll be getting "~1/n of all n elements".
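Here is a minimal sketch of that idea (the softmax-over-distances weighting and the temperature constant are illustrative choices, not part of this answer; TensorFlow 2 is assumed):

```python
import tensorflow as tf

values = tf.constant([10.0, 20.0, 30.0, 40.0])  # array to index
idx = tf.Variable(1.7)                          # continuous "index"
positions = tf.range(4, dtype=tf.float32)       # 0, 1, 2, 3

with tf.GradientTape() as tape:
    # Turn the distance between the continuous index and each position
    # into weights that sum to 1; a sharper (smaller) temperature
    # pushes the weights toward a one-hot selection.
    logits = -tf.square(positions - idx) / 0.1
    weights = tf.nn.softmax(logits)
    # Differentiable "read": a weighted sum instead of a hard gather.
    read = tf.reduce_sum(weights * values)

grad = tape.gradient(read, idx)  # well-defined, non-None gradient
print(read.numpy(), grad.numpy())
```

Because the weighted sum is built only from differentiable ops, the gradient with respect to idx is well defined, and lowering the temperature moves the read toward the "~100% of the k-th element" regime described above.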

answered Jan 03 '23 by Yaoshiang