CNN Loss stuck at 2.302 (ln(10))

Question

I am trying to train a neural network on the CIFAR-10 dataset, but I keep running into a very odd problem. I have tried more than six different CNN architectures, with many different convolutional hyperparameters and fully connected layer sizes, and every one of them gets stuck at a loss of 2.302 and a corresponding accuracy of 0.0625. I also tried dropout, l2_norm, different kernel sizes, and different padding in the convolution and max-pooling layers. Why does this happen? What property of a CNN or neural network causes it, and why does the loss get stuck at such an odd number?

I am implementing this in TensorFlow, and I have tried both a softmax layer with cross_entropy_loss and no softmax layer with sparse_cross_entropy_loss. Is the loss function stuck on a plateau?

floko · Accepted Answer

It looks like you accidentally applied a non-linearity (an activation function) to the last layer of your network. Keep in mind that cross entropy operates on values between 0 and 1. Because the softmax function applied just before the cross entropy already forces your outputs into that range, the last layer should use a linear activation; in other words, just don't add one.
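The effect of an extra activation on the last layer can be seen in a small sketch. It uses plain Python rather than TensorFlow, and the logit values, the label, and the helper names (`softmax`, `cross_entropy`, `relu`) are made up for illustration; it is not a reconstruction of the asker's network.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Softmax cross entropy for a single example with an integer label."""
    return -math.log(softmax(logits)[label])

def relu(values):
    return [max(0.0, v) for v in values]

# Hypothetical last-layer outputs for one image whose true class is 3.
raw_logits = [-4.0, -2.5, -3.0, 6.0, -1.0, -2.0, -3.5, -4.5, -2.0, -1.5]
label = 3

# Linear last layer: the confident logit gives a loss close to 0.
loss_linear = cross_entropy(raw_logits, label)

# Extra softmax before the loss: the inputs are squeezed into [0, 1],
# so the second softmax is almost uniform and the loss cannot drop
# below ln(1 + 9/e), roughly 1.46, however confident the network is.
loss_double_softmax = cross_entropy(softmax(raw_logits), label)

# ReLU on the last layer with all pre-activations negative: every output
# becomes 0, the softmax is exactly uniform, and the loss is ln(10).
dead_preactivations = [-4.0, -2.5, -3.0, -0.5, -1.0, -2.0, -3.5, -4.5, -2.0, -1.5]
loss_dead_relu = cross_entropy(relu(dead_preactivations), label)

print(loss_linear, loss_double_softmax, loss_dead_relu)
```

In this toy setup only the linear last layer lets the loss approach zero; both other variants hold it at or near the uniform-prediction value.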

By the way, the value 2.302 is no coincidence. It is the softmax loss -ln(0.1) = ln(10) that you get when all 10 classes (CIFAR-10) are assigned the same diffuse probability of 0.1, as expected at initialization. See the explanation by Andrej Karpathy: http://cs231n.github.io/neural-networks-3/
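The arithmetic is quick to check. The following is a minimal sketch, with `uniform_prediction_loss` as a hypothetical helper name, that computes the cross entropy of a uniform prediction over any number of classes:

```python
import math

def uniform_prediction_loss(num_classes):
    """Cross entropy when every class gets probability 1 / num_classes."""
    return -math.log(1.0 / num_classes)

cifar10_loss = uniform_prediction_loss(10)
print(f"{cifar10_loss:.4f}")  # 2.3026
```

A loss that stays at this value means the network's predictions are still effectively uniform over the 10 classes.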

Tags: artificial-intelligence, tensorflow, deep-learning, conv-neural-network

thelogicalkoan
