In this part of TensorFlow's tutorial, they mention that they are training with teacher forcing. To my knowledge, teacher forcing involves feeding the target output into the model so that it converges faster. The real target is tar_real, and as far as I can see, it is only used to calculate loss and accuracy. So how is this code implementing teacher forcing?
Thanks in advance.
Each train_step takes in inp and tar objects from the dataset in the training loop. Teacher forcing is indeed used, since the correct example from the dataset is always used as input during training (as opposed to the model's own, possibly incorrect, output from the previous step):
- tar is split into tar_inp and tar_real (offset by one character)
- inp and tar_inp are used as input to the model
- model produces an output, which is compared with tar_real to calculate the loss
- model output is discarded (not used anymore)

"Teacher forcing is a procedure ... in which during training the model receives the ground truth output y(t) as input at time t+1."
Page 372, Deep Learning, 2016.
Source: https://github.com/tensorflow/tensorflow/issues/30852#issuecomment-513528114
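The steps listed above can be sketched in plain Python. This is only an illustration, not the tutorial's code: split_target, toy_model, and the token ids are hypothetical names and values made up for the example, and the "loss" is a simple mismatch count standing in for the real loss function. The point is that the ground truth enters the model through tar_inp, while tar_real (the same sequence shifted by one position) is only used for scoring.

```python
def split_target(tar):
    """Split a target sequence into decoder input and expected output.

    Hypothetical helper: the two halves are the same sequence offset by one.
    """
    tar_inp = tar[:-1]   # fed to the model: ground truth up to position t
    tar_real = tar[1:]   # scored against: ground truth at position t + 1
    return tar_inp, tar_real


def toy_model(inp, tar_inp):
    """Hypothetical stand-in for the real model.

    Returns one (deliberately wrong) prediction per position of tar_inp.
    """
    return [0 for _ in tar_inp]


# Made-up token ids for illustration: 1 = start marker, 2 = end marker.
inp = [4, 5, 6]
tar = [1, 7, 8, 9, 2]

tar_inp, tar_real = split_target(tar)

# Teacher forcing: the model is conditioned on the ground truth (tar_inp),
# not on what it predicted at earlier positions.
predictions = toy_model(inp, tar_inp)

# The predictions are only compared with tar_real to compute a loss
# (here a mismatch count) and are never fed back into the model.
num_wrong = sum(p != t for p, t in zip(predictions, tar_real))

print(tar_inp)    # [1, 7, 8, 9]
print(tar_real)   # [7, 8, 9, 2]
print(num_wrong)  # 4
```

Even though every prediction of the toy model is wrong, the input at each position is still the correct previous token, which is what distinguishes teacher forcing from feeding the model its own output.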