
Keras RNN loss does not decrease over epoch

I built an RNN using Keras. The RNN is used to solve a regression problem:

from keras.models import Sequential
from keras.layers import BatchNormalization, LSTM, Dense, TimeDistributed
from keras.optimizers import RMSprop

def RNN_keras(feat_num, timestep_num=100):
    model = Sequential()
    model.add(BatchNormalization(input_shape=(timestep_num, feat_num)))
    # input_shape here is redundant: the BatchNormalization layer above already defines it
    model.add(LSTM(input_shape=(timestep_num, feat_num), output_dim=512, activation='relu', return_sequences=True))
    model.add(BatchNormalization())
    model.add(LSTM(output_dim=128, activation='relu', return_sequences=True))
    model.add(BatchNormalization())
    model.add(TimeDistributed(Dense(output_dim=1, activation='relu')))  # sequence labeling: one output per timestep

    rmsprop = RMSprop(lr=0.00001, rho=0.9, epsilon=1e-08)
    model.compile(loss='mean_squared_error',
                  optimizer=rmsprop,
                  metrics=['mean_squared_error'])
    return model
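For context, the per-batch training loop that produces the log below might look roughly like this (a minimal sketch; X_train, y_train, and the exact shuffling/batching code are assumptions, since the original loop is not shown in the question):

import numpy as np

# X_train: (num_examples, 100, 888), y_train: (num_examples, 100, 1)
model = RNN_keras(feat_num=888, timestep_num=100)

batch_size = 1280
num_batches = int(np.ceil(len(X_train) / float(batch_size)))

for epoch in range(3):
    # shuffle the training data at the beginning of each epoch
    perm = np.random.permutation(len(X_train))
    X_train, y_train = X_train[perm], y_train[perm]
    for b in range(num_batches):
        X_batch = X_train[b * batch_size:(b + 1) * batch_size]
        y_batch = y_train[b * batch_size:(b + 1) * batch_size]
        # train_on_batch returns [loss, mse] because of metrics=['mean_squared_error']
        loss, mse = model.train_on_batch(X_batch, y_batch)
        print('Epoch %d : Batch %d/%d | loss = %f | RMSE = %f'
              % (epoch + 1, b + 1, num_batches, loss, np.sqrt(mse)))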

The whole process looks fine, but the loss stays exactly the same over the epochs.

61267 in the training set
6808 in the test set

Building training input vectors ...
888 unique feature names
The length of each vector will be 888
Using TensorFlow backend.

Build model...

# Each batch has 1280 examples
# The training data are shuffled at the beginning of each epoch.

****** Iterating over each batch of the training data ******
Epoch 1/3 : Batch 1/48 | loss = 11011073.000000 | root_mean_squared_error = 3318.232910
Epoch 1/3 : Batch 2/48 | loss = 620.271667 | root_mean_squared_error = 24.904161
Epoch 1/3 : Batch 3/48 | loss = 620.068665 | root_mean_squared_error = 24.900017
......
Epoch 1/3 : Batch 47/48 | loss = 618.046448 | root_mean_squared_error = 24.859678
Epoch 1/3 : Batch 48/48 | loss = 652.977051 | root_mean_squared_error = 25.552946
****** Epoch 1: RMSD(training) = 24.897174 

Epoch 2/3 : Batch 1/48 | loss = 607.372620 | root_mean_squared_error = 24.644049
Epoch 2/3 : Batch 2/48 | loss = 599.667786 | root_mean_squared_error = 24.487448
Epoch 2/3 : Batch 3/48 | loss = 621.368103 | root_mean_squared_error = 24.926300
......
Epoch 2/3 : Batch 47/48 | loss = 620.133667 | root_mean_squared_error = 24.901398
Epoch 2/3 : Batch 48/48 | loss = 639.971924 | root_mean_squared_error = 25.297264
****** Epoch 2: RMSD(training) = 24.897174 

Epoch 3/3 : Batch 1/48 | loss = 651.519836 | root_mean_squared_error = 25.523636
Epoch 3/3 : Batch 2/48 | loss = 673.582581 | root_mean_squared_error = 25.952084
Epoch 3/3 : Batch 3/48 | loss = 613.930054 | root_mean_squared_error = 24.776562
......
Epoch 3/3 : Batch 47/48 | loss = 624.460327 | root_mean_squared_error = 24.988203
Epoch 3/3 : Batch 48/48 | loss = 629.544250 | root_mean_squared_error = 25.090448
****** Epoch 3: RMSD(training) = 24.897174 

I do NOT think this is normal. Am I missing something?


UPDATE: I found that all predictions are always zero after every epoch. This is why all the RMSDs are the same: the predictions are all identical, i.e. 0. I checked the training y; it contains only a few zeros, so this is not due to data imbalance.

So now I am wondering whether it is because of the layers and activations that I am using.
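A quick way to check this (a minimal diagnostic sketch; X_train is assumed to be the training array used above) is to look at the raw predictions and at the activations of the second LSTM layer, which would be all zero if the relu units upstream have "died":

import numpy as np
from keras import backend as K

preds = model.predict(X_train[:1280])
print(preds.min(), preds.max(), preds.mean())   # all zeros in this case

# Output of the second LSTM (relu) layer; the 0 passed below sets the learning
# phase to test mode, which BatchNormalization needs to know about.
get_lstm_out = K.function([model.layers[0].input, K.learning_phase()],
                          [model.layers[3].output])
lstm_out = get_lstm_out([X_train[:32], 0])[0]
print('fraction of zero activations:', (lstm_out == 0).mean())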

Munichong asked Sep 03 '16



2 Answers

Your RNN function seems to be OK.

How fast the loss decreases depends on the optimizer and the learning rate.

In any case, you are using a decay rate (rho) of 0.9; try a bigger learning rate, since it is going to decrease at that 0.9 rate anyway.

Try other optimizers with different learning rates. The other optimizers available in Keras are listed here: https://keras.io/optimizers/

Often an optimizer that works well on some datasets will fail on others.
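For example, a quick sketch of recompiling the same model with a larger learning rate, or with a different optimizer such as Adam (the specific values here are illustrative, not a recommendation):

from keras.optimizers import RMSprop, Adam

# Larger learning rate for RMSprop (the question uses lr=0.00001, which is very small)
model.compile(loss='mean_squared_error',
              optimizer=RMSprop(lr=0.001, rho=0.9, epsilon=1e-08),
              metrics=['mean_squared_error'])

# ... or switch to a different optimizer entirely
model.compile(loss='mean_squared_error',
              optimizer=Adam(lr=0.001),
              metrics=['mean_squared_error'])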

Pramod Patil answered Sep 28 '22


Have you tried changing the activation function from relu to softmax?

Relu activations have a tendency to diverge. However, initializing the weights with an eigenmatrix may result in better convergence.
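For illustration, here is one way the activations in the question's model could be swapped out. This sketch uses tanh (the Keras default for LSTM) and a linear output for the regression target; these are my substitutions for the sake of example, not necessarily the softmax suggested above:

from keras.models import Sequential
from keras.layers import BatchNormalization, LSTM, Dense, TimeDistributed
from keras.optimizers import RMSprop

def RNN_keras_tanh(feat_num, timestep_num=100):
    model = Sequential()
    model.add(BatchNormalization(input_shape=(timestep_num, feat_num)))
    model.add(LSTM(output_dim=512, activation='tanh', return_sequences=True))
    model.add(BatchNormalization())
    model.add(LSTM(output_dim=128, activation='tanh', return_sequences=True))
    model.add(BatchNormalization())
    # linear output so the regression targets are not clipped at zero
    model.add(TimeDistributed(Dense(output_dim=1, activation='linear')))
    model.compile(loss='mean_squared_error',
                  optimizer=RMSprop(lr=0.001, rho=0.9, epsilon=1e-08),
                  metrics=['mean_squared_error'])
    return model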

theblackcat102 answered Sep 28 '22