
loss, val_loss, acc and val_acc do not update at all over epochs

I created an LSTM network for sequence classification (binary) where each sample has 25 timesteps and 4 features. The following is my keras network topology:

[Image: Keras model summary of the LSTM network]
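
Roughly, the model in the summary above can be sketched as follows (the LSTM width is an assumption; the exact layer sizes are in the summary image):

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

# Rough sketch of the described topology; the LSTM width (64) is an assumption,
# the rest follows the description in this question.
model = Sequential()
model.add(LSTM(64, input_shape=(25, 4)))   # 25 timesteps, 4 features per sample
model.add(Dense(1))
model.add(Activation('softmax'))           # softmax activation after the Dense layer
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])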

Above, the activation layer after the Dense layer uses the softmax function. I used binary_crossentropy as the loss function and Adam as the optimizer to compile the Keras model. I trained the model with batch_size=256, shuffle=True and validation_split=0.05. The following is the training log:

Train on 618196 samples, validate on 32537 samples
2017-09-15 01:23:34.407434: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-09-15 01:23:34.407719: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties: 
name: GeForce GTX 1050
major: 6 minor: 1 memoryClockRate (GHz) 1.493
pciBusID 0000:01:00.0
Total memory: 3.95GiB
Free memory: 3.47GiB
2017-09-15 01:23:34.407735: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 
2017-09-15 01:23:34.407757: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y 
2017-09-15 01:23:34.407764: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0)
618196/618196 [==============================] - 139s - loss: 4.3489 - acc: 0.7302 - val_loss: 4.4316 - val_acc: 0.7251
Epoch 2/50
618196/618196 [==============================] - 132s - loss: 4.3489 - acc: 0.7302 - val_loss: 4.4316 - val_acc: 0.7251
Epoch 3/50
618196/618196 [==============================] - 134s - loss: 4.3489 - acc: 0.7302 - val_loss: 4.4316 - val_acc: 0.7251
Epoch 4/50
618196/618196 [==============================] - 133s - loss: 4.3489 - acc: 0.7302 - val_loss: 4.4316 - val_acc: 0.7251
Epoch 5/50
618196/618196 [==============================] - 132s - loss: 4.3489 - acc: 0.7302 - val_loss: 4.4316 - val_acc: 0.7251
Epoch 6/50
618196/618196 [==============================] - 132s - loss: 4.3489 - acc: 0.7302 - val_loss: 4.4316 - val_acc: 0.7251
Epoch 7/50
618196/618196 [==============================] - 132s - loss: 4.3489 - acc: 0.7302 - val_loss: 4.4316 - val_acc: 0.7251
Epoch 8/50
618196/618196 [==============================] - 132s - loss: 4.3489 - acc: 0.7302 - val_loss: 4.4316 - val_acc: 0.7251

... and so on through 50 epochs with the same numbers

So far, I have also tried the rmsprop and nadam optimizers, and batch sizes of 128, 512 and 1024, but loss, val_loss, acc and val_acc always remained the same throughout all epochs, yielding accuracy in the range of 0.72 to 0.74 in each attempt.

Asked Sep 15 '17 by Kaushik Shrestha

People also ask

What are loss and val_loss?

val_loss is the value of the cost function on your cross-validation data and loss is the value of the cost function on your training data. On validation data, dropout is not applied, so no neurons are randomly dropped.

What is loss in epochs?

"loss" refers to the loss value over the training data after each epoch. This is what the optimization process is trying to minimize with the training so, the lower, the better. "accuracy" refers to the ratio between correct predictions and the total number of predictions in the training data. The higher, the better.

What does Val_acc mean?

val_acc is the accuracy computed on the validation set (data that have never been 'seen' by the model). Batch size for testing is exactly the same concept as the training batch size: you usually cannot load all your testing data into memory, so you have to use batches.

What is Val accuracy in keras?

'val_acc' refers to the validation set. Note that val_acc is computed on a set of samples that was not shown to the network during training, and hence indicates how well your model works in general on cases outside the training set. It is common for validation accuracy to be lower than training accuracy.
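
For reference, all four of these values come from the History object that Keras returns from fit. A minimal sketch, assuming model, x_train and y_train are already defined:

# model, x_train and y_train are placeholders for your own objects
history = model.fit(x_train, y_train,
                    batch_size=256,
                    epochs=50,
                    shuffle=True,
                    validation_split=0.05)

# Keras 2 of this era records the metrics under these keys
print(history.history['loss'][-1], history.history['acc'][-1])
print(history.history['val_loss'][-1], history.history['val_acc'][-1])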


1 Answer

The softmax activation makes sure the sum of the outputs is 1. It's useful for ensuring that exactly one class among many classes is predicted.

Since you have only 1 output (only one class), it's certainly a bad idea. You're probably ending up with 1 as the result for all samples.
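
A quick way to see this: softmax normalizes over the output vector, and for a one-element vector that normalization always yields 1 (the logit values below are hypothetical):

import numpy as np

def softmax(z):
    # numerically stable softmax over a 1-D array
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([-3.2])))  # [ 1.] -- a single-output softmax is always 1
print(softmax(np.array([5.7])))   # [ 1.] -- regardless of the logit value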

Use sigmoid instead. It goes well with binary_crossentropy.
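
A sketch of the suggested change, under the same assumptions about layer sizes as above:

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(64, input_shape=(25, 4)))    # 25 timesteps, 4 features
model.add(Dense(1, activation='sigmoid'))   # sigmoid instead of softmax
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

With a single sigmoid unit the output is interpreted as the probability of the positive class, which is exactly what binary_crossentropy expects.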

Answered Oct 21 '22 by Daniel Möller