Keras gives the same loss on every epoch

I am new to Keras.

I trained a model on a dataset where my objective was to reduce the log loss. It reports the same loss value for every epoch, and I am not sure whether I am on the right track.

For example:

Epoch 1/5
91456/91456 [==============================] - 142s - loss: 3.8019 - val_loss: 3.8278
Epoch 2/5
91456/91456 [==============================] - 139s - loss: 3.8019 - val_loss: 3.8278
Epoch 3/5
91456/91456 [==============================] - 143s - loss: 3.8019 - val_loss: 3.8278
Epoch 4/5
91456/91456 [==============================] - 142s - loss: 3.8019 - val_loss: 3.8278
Epoch 5/5
91456/91456 [==============================] - 142s - loss: 3.8019 - val_loss: 3.8278

The loss is 3.8019 in every epoch, but it should decrease as training progresses.

I ran into this issue as well. After much deliberation, I figured out that it was my activation function on my output layer.

I had this model to predict a binary outcome:

model = Sequential()
model.add(Dense(1, activation='softmax'))

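A softmax over a single unit always outputs 1.0, so the prediction never changes and neither does the loss. For binary cross-entropy I needed a sigmoid instead: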

model = Sequential()
model.add(Dense(1, activation='sigmoid'))

I would look at the problem you are trying to solve and the output it needs, and check that your output activation matches your loss function.
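To see why the single-unit softmax gets stuck, here is a minimal sketch in plain Python (no Keras needed). The `softmax` and `sigmoid` helpers and the sample logits are my own illustration, not code from the question:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical raw outputs (logits) of a Dense(1) layer for three samples.
logits = [-3.0, 0.0, 2.5]

# Softmax normalizes across the units of one sample. With a single unit
# there is nothing to normalize against, so the result is always 1.0.
softmax_outputs = [softmax([z])[0] for z in logits]

# Sigmoid depends on the logit, so the weights can actually change it.
sigmoid_outputs = [sigmoid(z) for z in logits]

print(softmax_outputs)
print(sigmoid_outputs)
```

Because the softmax output is constant, its gradient with respect to the weights is zero, which is consistent with a loss that does not move between epochs.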

Try decreasing your learning rate to 0.0001 and use Adam. What is your learning rate?
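As a rough sketch of what that looks like, assuming a TensorFlow 2 installation of Keras and a binary target. The layer sizes and `n_features` are placeholders I made up, since the question does not show the model:

```python
from tensorflow import keras

n_features = 20  # hypothetical input width; use your own data's column count

model = keras.Sequential([
    keras.Input(shape=(n_features,)),
    keras.layers.Dense(32, activation="relu"),    # placeholder hidden layer
    keras.layers.Dense(1, activation="sigmoid"),  # binary output
])

# Adam with a learning rate of 0.0001.
# Older Keras releases spelled this argument `lr` instead of `learning_rate`.
optimizer = keras.optimizers.Adam(learning_rate=1e-4)
model.compile(optimizer=optimizer, loss="binary_crossentropy")
```

If the loss starts moving with the smaller rate, the original rate was probably too high; if it is still flat, the cause is likely elsewhere.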

It's not clear whether the problem is the learning rate or the model's complexity. Could you give more detail on the following points?

  1. How large is your dataset, and what does it contain?
  2. How complex is your model? We can judge whether its capacity is appropriate by looking at your data. A large or complicated dataset may need a more complex model.
  3. Did you normalize your targets? Unscaled inputs often still produce results (although scaling them usually helps), but targets larger than 1 have to be normalized. The last layer's activation is usually sigmoid or softmax, which squash the output into the range 0 to 1, or tanh, which squashes it into -1 to 1. Scale your targets to the range of your last activation function, then invert that scaling on the predictions to get real-world values.
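Point 3 can be sketched as follows, using simple min-max scaling for a sigmoid output. The helper names and the sample targets are hypothetical, and other scalers would work just as well:

```python
def fit_min_max(values):
    """Return the (min, max) of the training targets."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("targets are constant, nothing to scale")
    return lo, hi

def scale(values, lo, hi):
    """Map values from [lo, hi] into [0, 1], the range of a sigmoid."""
    return [(v - lo) / (hi - lo) for v in values]

def unscale(scaled, lo, hi):
    """Invert scale(): map [0, 1] back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]

# Hypothetical regression targets, all larger than 1.
targets = [12.0, 30.0, 48.0]

lo, hi = fit_min_max(targets)
scaled = scale(targets, lo, hi)      # train the network on these
restored = unscale(scaled, lo, hi)   # apply this to the predictions

print(scaled)
print(restored)
```

Compute `lo` and `hi` on the training set only and reuse them for validation and test data, so that all predictions are mapped back with the same scaling.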

Since you're new to deep learning, these checks should be enough to tell whether there is a problem with your model. Could you go through them and reply?

