I'm training a network with multiple losses, using a generator both to create the data and to feed it into the network.
I've checked the structure of the data and it generally looks fine, and training proceeds as expected most of the time. However, at a seemingly random epoch, the training loss for every prediction suddenly jumps from, say,
# End of epoch 3
loss: 2.8845
to
# Beginning of epoch 4
loss: 1.1921e-07
I thought it could be the data; however, from what I can tell the data is generally fine. It's even more suspicious because this happens at a random epoch (perhaps triggered by a random data point chosen during SGD?), but once it happens it persists for the rest of training. That is, if the training loss drops to 1.1921e-07 at epoch 3,
then it will continue this way in epoch 4, epoch 5, etc.
However, there are times when it reaches epoch 5 without this happening, and then it happens at epoch 6 or 7.
Is there any viable reason, outside of the data, that could cause this? Could a few dodgy data points really cause this so quickly?
Thanks
EDIT:
Results:
300/300 [==============================] - 339s - loss: 3.2912 - loss_1: 1.8683 - loss_2: 9.1352 - loss_3: 5.9845 -
val_loss: 1.1921e-07 - val_loss_1: 1.1921e-07 - val_loss_2: 1.1921e-07 - val_loss_3: 1.1921e-07
All epochs after this one have training loss 1.1921e-07.
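For what it's worth, 1.1921e-07 is not an arbitrary number: it is float32 machine epsilon. Assuming Keras's default behavior of clipping predicted probabilities into [ε, 1 − ε] with a backend epsilon of 1e-7 before taking the log, a prediction that fully saturates at 1.0 for the true class produces a crossentropy that bottoms out at exactly this value. A minimal NumPy sketch of that arithmetic:

```python
import numpy as np

# Keras-style clipping: probabilities are kept inside [eps, 1 - eps]
# before the log, with a default backend epsilon of 1e-7.
eps = np.float32(1e-7)

# A prediction that has fully saturated at 1.0 for the true class
# gets clipped down to 1 - eps (which rounds to 1 - 2**-23 in float32).
p_clipped = np.float32(1.0) - eps

# Categorical crossentropy for a one-hot target: -log(p_true).
loss = -np.log(p_clipped)

print(f"{loss:.4e}")             # reproduces the 1.1921e-07 from the logs
print(np.finfo(np.float32).eps)  # float32 machine epsilon, same value
```

So a loss pinned at 1.1921e-07 means the network's outputs have collapsed to fully saturated (one-hot) predictions, and the loss can go no lower in float32.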
Not entirely sure how satisfactory this is as an answer, but my findings suggest that using multiple categorical_crossentropy losses together makes the network highly unstable. Swapping them out for other loss functions fixed the problem, with the data unchanged.