I'm training a network with multiple losses, using a generator both to create the data and to feed it into the network.
I've checked the structure of the data and it generally looks fine, and training proceeds as expected most of the time. However, at a seemingly random epoch, the training loss for every prediction suddenly jumps from, say,
# End of epoch 3
loss: 2.8845
to
# Beginning of epoch 4
loss: 1.1921e-07
I thought it could be the data; however, from what I can tell the data is generally fine. It's even more suspicious because this happens at a random epoch (perhaps triggered by a random data point chosen during SGD?), but once it happens it persists for the rest of training. That is, if the training loss drops to 1.1921e-07 at epoch 3,
then it will continue this way in epoch 4, epoch 5, etc.
However, there are times when it reaches epoch 5 without this happening, and then it happens at epoch 6 or 7.
Is there any viable reason, outside of the data, that could cause this? Could a few dodgy data points really cause this so quickly?
Thanks
EDIT:
Results:
300/300 [==============================] - 339s - loss: 3.2912 - loss_1: 1.8683 - loss_2: 9.1352 - loss_3: 5.9845 -
val_loss: 1.1921e-07 - val_loss_1: 1.1921e-07 - val_loss_2: 1.1921e-07 - val_loss_3: 1.1921e-07
All epochs after this one have training loss 1.1921e-07.
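For what it's worth, 1.1921e-07 is not an arbitrary number: it is float32 machine epsilon. Assuming Keras's default behavior of clipping predicted probabilities into [ε, 1 − ε] with a backend epsilon of 1e-7 before taking the log, a prediction that fully saturates at 1.0 for the true class produces a crossentropy that bottoms out at exactly this value. A minimal NumPy sketch of that arithmetic:

```python
import numpy as np

# Keras-style clipping: probabilities are kept inside [eps, 1 - eps]
# before the log, with a default backend epsilon of 1e-7.
eps = np.float32(1e-7)

# A prediction that has fully saturated at 1.0 for the true class
# gets clipped down to 1 - eps (which rounds to 1 - 2**-23 in float32).
p_clipped = np.float32(1.0) - eps

# Categorical crossentropy for a one-hot target: -log(p_true).
loss = -np.log(p_clipped)

print(f"{loss:.4e}")             # reproduces the 1.1921e-07 from the logs
print(np.finfo(np.float32).eps)  # float32 machine epsilon, same value
```

So a loss pinned at 1.1921e-07 means the network's outputs have collapsed to fully saturated (one-hot) predictions, and the loss can go no lower in float32.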
Not entirely sure how satisfactory this is as an answer, but my findings suggest that using multiple categorical_crossentropy losses together makes the network highly unstable. Swapping them out for other loss functions fixed the problem, with the data unchanged.