
Understanding TensorFlow sequence_loss parameters


The sequence_loss module's source code has three required parameters, listed as outputs, targets, and weights.

Outputs and targets are self-explanatory, but what I'm looking to understand better is the weights parameter. What is it?

The other thing I find confusing is that it states the targets should be the same length as the outputs. What exactly do they mean by the length of a tensor, especially if it's a 3-dimensional tensor?

asked Dec 14 '16 by TheM00s3


1 Answer

Think of the weights as a mask applied to the input tensor. In many NLP applications, the sentences in a batch have different lengths. To batch multiple sentences into a minibatch and feed it to a neural net, people use a mask matrix to denote which elements of the input tensor are actually valid input. For instance, a weight matrix of np.ones([batch, max_length]) means that all of the input elements are valid.
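For concreteness, here is a minimal sketch of how the weights fit in, assuming the TF 1.x tf.contrib.seq2seq.sequence_loss API with the signature (logits, targets, weights); the shapes and random values below are invented purely for illustration:

import numpy as np
import tensorflow as tf

batch, max_length, vocab = 3, 4, 5

logits  = tf.placeholder(tf.float32, [batch, max_length, vocab])  # model outputs
targets = tf.placeholder(tf.int32,   [batch, max_length])         # gold token ids
weights = tf.placeholder(tf.float32, [batch, max_length])         # per-step mask

loss = tf.contrib.seq2seq.sequence_loss(logits, targets, weights)

with tf.Session() as sess:
    feed = {
        logits:  np.random.randn(batch, max_length, vocab),
        targets: np.random.randint(0, vocab, size=(batch, max_length)),
        # all ones: every timestep of every sentence counts in the loss
        weights: np.ones([batch, max_length], dtype=np.float32),
    }
    print(sess.run(loss, feed))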

We can also use a matrix of the same shape as the labels, such as np.asarray([[1,1,1,0],[1,1,0,0],[1,1,1,1]]) (assuming the labels have shape 3x4); then the cross-entropy of the first row, last column is masked out to 0.
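To see what that masking does numerically, here is a pure-NumPy sketch (the per-step cross-entropy numbers are made up) of how zero weights drop the padded positions and the remaining loss is averaged over the real tokens only:

import numpy as np

# per-timestep cross-entropy for 3 sentences padded to length 4 (made-up numbers)
step_xent = np.array([[0.7, 0.3, 0.9, 2.1],
                      [0.5, 1.2, 1.8, 0.4],
                      [0.2, 0.6, 0.3, 0.8]])

# 1 = real token, 0 = padding; same 3x4 shape as the labels
weights = np.asarray([[1, 1, 1, 0],
                      [1, 1, 0, 0],
                      [1, 1, 1, 1]], dtype=np.float32)

masked = step_xent * weights           # padded steps contribute nothing
loss = masked.sum() / weights.sum()    # average over the valid tokens only
print(loss)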

You can also use the weights to compute a weighted accumulation of the cross-entropy.
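For example (a hypothetical use, not something the question requires), non-binary weights let some tokens count more than others in that accumulation:

import numpy as np

step_xent = np.array([[0.7, 0.3, 0.9],
                      [0.5, 1.2, 1.8]])
weights   = np.array([[1.0, 2.0, 1.0],   # middle token counts double
                      [1.0, 1.0, 0.5]])  # last token counts half

weighted_loss = (step_xent * weights).sum() / weights.sum()
print(weighted_loss)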

answered Sep 25 '22 by Jon