I am new to machine learning and am currently trying to train a convolutional neural net with 3 convolutional layers and 1 fully connected layer. I am using a dropout probability of 25% and a learning rate of 0.0001. I have 6000 training images of size 150x200 and 13 output classes. I am using TensorFlow. I am noticing a trend where my loss steadily decreases, but my accuracy increases only slightly and then drops back down again. The blue lines are for my training set and the orange lines are for my validation set; the x-axis is training steps.
I am wondering if there is something I am not understanding, or what the possible causes of this phenomenon could be. From the material I have read, I assumed low loss meant high accuracy. Here is my loss function.
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
If both loss and accuracy are low, the model is making small errors on most of the data. If both are high, it is making big errors on some of the data. Finally, if the accuracy is high and the loss is low, the model makes small errors on just some of the data, which would be the ideal case.
Most of the time we would observe that accuracy increases with the decrease in loss -- but this is not always the case. Accuracy and loss have different definitions and measure different things. They often appear to be inversely proportional but there is no mathematical relationship between these two metrics.
If the loss increases and the accuracy increases too, it is because your regularization techniques are working well and you are fighting the overfitting problem. This holds only if the loss then starts to decrease while the accuracy continues to increase.
By definition, accuracy is the proportion of predictions the model gets right, while loss values indicate how far the predictions are from the desired target state(s).
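To make that concrete, here is a minimal NumPy sketch with made-up probability vectors (3 classes instead of the question's 13, purely for readability): two sets of predictions with exactly the same accuracy can have very different cross-entropy losses, because the loss also measures how confident the model is on the correct class.

import numpy as np

def cross_entropy(probs, labels):
    # mean negative log-probability assigned to the true class
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

def accuracy(probs, labels):
    # fraction of samples whose most likely class matches the true class
    return np.mean(np.argmax(probs, axis=1) == labels)

labels = np.array([0, 1, 2, 0])            # true classes for 4 samples, 3 classes

confident = np.array([[0.9, 0.05, 0.05],   # correct and confident
                      [0.1, 0.8,  0.1 ],
                      [0.1, 0.1,  0.8 ],
                      [0.2, 0.7,  0.1 ]])  # wrong

hesitant  = np.array([[0.4, 0.3,  0.3 ],   # correct, but only barely
                      [0.3, 0.4,  0.3 ],
                      [0.3, 0.3,  0.4 ],
                      [0.3, 0.4,  0.3 ]])  # wrong

print(accuracy(confident, labels), cross_entropy(confident, labels))  # 0.75, lower loss
print(accuracy(hesitant,  labels), cross_entropy(hesitant,  labels))  # 0.75, higher loss

The same mechanism works in reverse: the loss can keep falling because the model grows more confident on examples it already classifies correctly (or less wrong on examples it still misclassifies) without a single argmax flipping, so the accuracy stays flat.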
That is because loss and accuracy are two totally different things (well, at least logically)!

Consider an example where you have defined the loss as:

loss = (1 - accuracy)

In this case, when you try to minimize the loss, the accuracy increases automatically.

Now consider another example where you define the loss as:

loss = average(prediction_probabilities)

Though it does not make much sense, it is technically still a valid loss function, and your weights are still tuned in order to minimize it.

But as you can see, in this case there is no relation between the loss and the accuracy, so you cannot expect both to increase or decrease at the same time.

Note: the loss will always be minimized (thus your loss decreases after each iteration)!

PS: Please update your question with the loss function you are trying to minimize.
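As a concrete illustration of how the two quantities are computed separately, here is a minimal TensorFlow 1.x sketch in the style of the question's code. Placeholders stand in for the question's pred (logits) and y (one-hot labels); the optimizer, x, batch_x, and batch_y names in the commented run call are hypothetical.

import tensorflow as tf  # TensorFlow 1.x API, matching the question

# stand-ins for the question's `pred` (raw logits) and `y` (one-hot labels)
pred = tf.placeholder(tf.float32, [None, 13])   # 13 output classes, as in the question
y = tf.placeholder(tf.float32, [None, 13])

# the loss actually being minimized (note the keyword arguments)
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))

# accuracy is a separate op: the fraction of examples whose argmax matches the label
correct = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

# both can then be monitored in the same training run, e.g.:
# _, c, a = sess.run([optimizer, cost, accuracy], feed_dict={x: batch_x, y: batch_y})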
softmax_cross_entropy_with_logits() and the accuracy are two different concepts with different mathematical definitions. Under normal circumstances we can expect to get higher accuracy by minimizing the softmax cross entropy, but because they are calculated in different ways, we cannot expect them to always increase or decrease in a synchronized way.
We use softmax cross entropy in CNNs because it is effective for neural network training. If we used loss = (1 - accuracy) as the loss function instead, it would be very difficult to improve the result by adjusting the weights with our current, mature backpropagation-based training methods. I actually tried this and confirmed that conclusion; you can try it yourself as well. Maybe the limitation lies in today's backpropagation training approach, or maybe the definition of our neurons would have to change to some other type, but in any case, putting accuracy directly into the loss function is currently not an effective way to train a neural network. So just use softmax_cross_entropy_with_logits(), as the researchers recommend; it has been confirmed to work well, and the alternatives are not well understood yet.
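One way to see why an accuracy-based loss fights backpropagation is that accuracy is a step function of the weights: a small weight update moves the cross-entropy smoothly but usually does not flip any argmax, so a loss like (1 - accuracy) gives gradient descent nothing to follow almost everywhere. A minimal NumPy sketch with made-up logits (not from the question's model):

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

true_class = 0
for logit in [0.50, 0.51, 0.52, 0.53]:       # nudge the true class's logit slightly
    z = np.array([logit, 1.0, -1.0])         # made-up logits for a 3-class example
    probs = softmax(z)
    acc = float(np.argmax(z) == true_class)  # stays 0.0 for every nudge: argmax is still class 1
    ce = -np.log(probs[true_class])          # decreases smoothly with every nudge
    print(f"logit={logit:.2f}  accuracy={acc}  cross-entropy={ce:.4f}")

Only when the nudged logit finally overtakes the largest one does the accuracy jump from 0 to 1; everywhere else the accuracy surface is flat, which is exactly the kind of landscape gradient-based training cannot navigate.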