Understanding TensorFlow sequence_loss parameters

Question

The source code for sequence_loss lists three required parameters: outputs, targets, and weights.

Outputs and targets are self-explanatory, but what is the weights parameter for?

The other thing I find confusing is that it states that the targets should be the same length as the outputs. What exactly do they mean by the length of a tensor, especially if it's a 3-dimensional tensor?

Jon · Accepted Answer

Think of the weights as a mask applied to the input tensor. In NLP applications, sentences often have different lengths. To batch multiple sentences into a minibatch and feed it to a neural net, people pad them to a common length and use a mask matrix to denote which elements of the input tensor are actually valid input. For instance, the weights can be np.ones([batch, max_length]), which means all of the input elements are valid.
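As a rough illustration of how such a mask could be built, here is a minimal NumPy sketch. The sentence lengths and max_length are made-up values for the example, not something taken from the TensorFlow source.

```python
import numpy as np

# Hypothetical lengths of three sentences, all padded to max_length = 4.
lengths = np.array([3, 2, 4])
max_length = 4

# Position t is valid for sentence b when t < lengths[b].
positions = np.arange(max_length)  # shape [max_length]
weights = (positions[None, :] < lengths[:, None]).astype(np.float32)

print(weights.shape)  # (3, 4), i.e. [batch, max_length]
print(weights)
```

Each row has ones for the real tokens and zeros for the padding, so it can be passed as the weights for a batch padded this way.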

We can also use a matrix of the same shape as the labels, such as np.asarray([[1,1,1,0],[1,1,0,0],[1,1,1,1]]) (assuming the labels have shape 3x4). Then the cross-entropy at the first row, last column is masked out to 0, as are the last two columns of the second row.
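To make the masking concrete, here is a small NumPy sketch of a masked sequence loss. It is not TensorFlow's implementation; the function name masked_sequence_loss, the random logits, and the choice to normalize by the total weight are assumptions made for the example.

```python
import numpy as np

def masked_sequence_loss(logits, targets, weights):
    """Weighted average of per-token cross-entropy (illustrative only).

    logits:  [batch, max_length, num_classes] unnormalized scores
    targets: [batch, max_length] integer class ids
    weights: [batch, max_length] mask or general weights
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Negative log-probability of the target class at each position.
    token_nll = -np.take_along_axis(log_probs, targets[..., None], axis=-1)[..., 0]
    # Positions with weight 0 contribute nothing to the total.
    per_token = token_nll * weights
    return per_token.sum() / max(weights.sum(), 1e-12), per_token

rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 4, 5))       # made-up scores, 5 classes
targets = rng.integers(0, 5, size=(3, 4))  # made-up labels, shape 3x4
weights = np.asarray([[1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 1, 1]], dtype=np.float64)

loss, per_token = masked_sequence_loss(logits, targets, weights)
print(per_token[0, 3])  # 0.0, first row, last column is masked out
```

Dividing by weights.sum() rather than by the number of elements keeps the padding from diluting the average; other normalizations are possible.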

You can also use the weights to compute a weighted accumulation of the cross-entropy, giving some positions more influence on the loss than others.
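A quick sketch of what that could look like with non-binary weights. The per-token loss values and the weights are invented for the example, and normalizing by the total weight is one possible choice, not necessarily what the library does by default.

```python
import numpy as np

# Hypothetical per-token cross-entropy values, shape [batch, max_length].
token_loss = np.array([[0.5, 1.0, 2.0, 0.7],
                       [0.2, 0.4, 0.9, 0.9]])

# Weight 0 ignores a position (padding); weight 2 counts a token twice as much.
weights = np.array([[1.0, 1.0, 2.0, 0.0],
                    [1.0, 2.0, 0.0, 0.0]])

# Weighted accumulation, normalized by the total weight.
loss = (token_loss * weights).sum() / weights.sum()
print(loss)
```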

TheM00s3