I am training a normal feed-forward network on financial data of the last 90 days of a stock, and I am predicting whether the stock will go up or down on the next day. I am using binary cross entropy as my loss and standard SGD for the optimizer. When I train, the training and validation loss continue to go down as they should, but the accuracy and validation accuracy stay around the same.
Here's my model:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 90, 256) 1536
_________________________________________________________________
elu (ELU) (None, 90, 256) 0
_________________________________________________________________
flatten (Flatten) (None, 23040) 0
_________________________________________________________________
dropout (Dropout) (None, 23040) 0
_________________________________________________________________
dense_1 (Dense) (None, 1024) 23593984
_________________________________________________________________
elu_1 (ELU) (None, 1024) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_2 (Dense) (None, 512) 524800
_________________________________________________________________
elu_2 (ELU) (None, 512) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 512) 0
_________________________________________________________________
dense_3 (Dense) (None, 512) 262656
_________________________________________________________________
elu_3 (ELU) (None, 512) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 512) 0
_________________________________________________________________
dense_4 (Dense) (None, 256) 131328
_________________________________________________________________
activation (Activation) (None, 256) 0
_________________________________________________________________
dense_5 (Dense) (None, 2) 514
_________________________________________________________________
activation_1 (Activation) (None, 2) 0
_________________________________________________________________
Total params: 24,514,818
Trainable params: 24,514,818
Non-trainable params: 0
_________________________________________________________________
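For reference, a model matching this summary can be built roughly like the sketch below. The input shape (90, 5) is inferred from the first layer's 1,536 parameters; the dropout rates and the two unnamed activations (ELU, then softmax) are placeholders, not confirmed by the summary:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(90, 5)),   # 90 days x 5 features (inferred: 256 * (5 + 1) = 1,536 params)
    layers.Dense(256),
    layers.ELU(),
    layers.Flatten(),             # 90 * 256 = 23,040
    layers.Dropout(0.5),          # rate assumed
    layers.Dense(1024),
    layers.ELU(),
    layers.Dropout(0.5),
    layers.Dense(512),
    layers.ELU(),
    layers.Dropout(0.5),
    layers.Dense(512),
    layers.ELU(),
    layers.Dropout(0.5),
    layers.Dense(256),
    layers.Activation("elu"),     # activation assumed
    layers.Dense(2),
    layers.Activation("softmax"), # activation assumed
])

model.compile(optimizer="sgd", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()                   # reproduces the parameter counts above
```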
I expect that either both losses should decrease while both accuracies increase, or the network will overfit and the validation loss and accuracy won't change much. Either way, shouldn't the loss and its corresponding accuracy value be directly linked and move inversely to each other?
Also, I notice that my validation loss is always less than my normal loss, which seems wrong to me.
Here's the loss (Normal: Blue, Validation: Green):
Here's the accuracy (Normal: Black, Validation: Yellow):
By definition, accuracy is the fraction of predictions that are correct, while the loss measures how far the predicted probabilities are from the desired targets. The two can move somewhat independently. Low loss with high accuracy is the best case: the model is right on most samples and confident about it. High loss with low accuracy means the model makes big, confident errors on most of the data. Low loss with low accuracy means the errors are individually small, but many predictions still land on the wrong side of the decision threshold. And high loss with high accuracy means the model is right on most samples but makes a few very large, confident errors.
Loss and accuracy are indeed connected, but the relationship is not so simple.
Let's say we have 6 samples, and our y_true is:
[0, 0, 0, 1, 1, 1]
Furthermore, let's assume our network predicts the following probabilities:
[0.9, 0.9, 0.9, 0.1, 0.1, 0.1]
This gives a mean binary cross-entropy loss of ~2.30, and an accuracy of zero, since every sample lands on the wrong side of the 0.5 threshold.
Now, after a parameter update via backprop, suppose the new predictions are:
[0.6, 0.6, 0.6, 0.4, 0.4, 0.4]
These are better estimates of the true distribution (the loss drops to ~0.92), yet the accuracy hasn't changed and is still zero.
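To check the arithmetic, here is a small NumPy sketch (the helper names bce and accuracy are mine) that reproduces both loss values and the unchanged zero accuracy:

```python
import numpy as np

def bce(y_true, y_pred):
    """Mean binary cross-entropy over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(-(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))))

def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    return float(np.mean((np.asarray(y_pred) > threshold).astype(int) == np.asarray(y_true)))

y_true = [0, 0, 0, 1, 1, 1]
before = [0.9, 0.9, 0.9, 0.1, 0.1, 0.1]
after  = [0.6, 0.6, 0.6, 0.4, 0.4, 0.4]

print(bce(y_true, before), accuracy(y_true, before))  # ~2.30, 0.0
print(bce(y_true, after),  accuracy(y_true, after))   # ~0.92, 0.0 -- loss improved, accuracy did not
```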
All in all, the relationship is more complicated than a simple inverse: the network can improve its predictions on some examples while getting worse on others, which keeps the accuracy roughly constant even as the loss changes.
Such a situation usually occurs when your data is really complicated (or incomplete) and/or your model is too weak. Here both are the case: financial data prediction involves a lot of hidden variables that your model cannot infer. In addition, plain dense layers are not well suited to this task; each day depends on the previous values, which makes it a natural fit for recurrent neural networks. You can find an article about LSTMs and how to use them here (and tons of others over the web).
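For illustration only, a recurrent baseline in Keras could look like the sketch below; the layer sizes, dropout rate, and single sigmoid output are my own assumptions, not something prescribed by the answer:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Small LSTM baseline: 90 time steps, 5 features per day (adjust to your data).
rnn = keras.Sequential([
    keras.Input(shape=(90, 5)),
    layers.LSTM(64, return_sequences=True),  # keep the sequence for the second recurrent layer
    layers.LSTM(32),                          # collapse the sequence into a single vector
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),    # probability that the stock goes up
])

rnn.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```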