I built a deep learning model that accepts an image of size 250*250*3 and outputs a binary vector of length 62500 (250*250), with 0 for pixels that belong to the background and 1 for pixels that belong to the ROI. My model is based on DenseNet121, but when I use softmax as the activation function in the last layer together with the categorical cross entropy loss function, the loss is nan. What is the best loss and activation function that I can use in my model? What is the difference between the binary cross entropy and categorical cross entropy loss functions? Thanks in advance.
What is the best loss and activation function that I can use in my model?
Use binary_crossentropy: every output is independent, not mutually exclusive, and takes the value 0 or 1, so use sigmoid in the last layer. Check this interesting question/answer.
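To make the per-pixel idea concrete, here is a minimal sketch in plain Python (no Keras needed) of sigmoid plus binary cross entropy on a flattened mask. The 2x2 "mask", the logit values, and the helper names are made up for illustration; the loss uses the numerically stable form max(x, 0) - x*z + log(1 + exp(-|x|)) that the TensorFlow docs give for sigmoid cross entropy on logits.

```python
import math

def sigmoid(x):
    # Logistic function, written so that exp() never overflows.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def binary_crossentropy_from_logits(logits, targets):
    # Mean over all outputs of: max(x, 0) - x*z + log(1 + exp(-|x|)).
    # Each output (pixel) contributes its own independent term.
    total = 0.0
    for x, z in zip(logits, targets):
        total += max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

# Hypothetical 2x2 mask flattened to 4 outputs (the real model has 62500).
logits = [4.0, -3.0, 2.5, -5.0]   # raw last-layer values, before sigmoid
targets = [1.0, 0.0, 1.0, 0.0]    # 1 = ROI pixel, 0 = background pixel

probs = [sigmoid(x) for x in logits]          # independent per-pixel probabilities
mask = [1 if p > 0.5 else 0 for p in probs]   # threshold to get the binary mask
loss = binary_crossentropy_from_logits(logits, targets)

print(mask)
print(loss)
```

Because the loss is computed from the logits in this form, it stays finite even for very large positive or negative logits, which is one reason the "from logits" variant is usually preferred over taking log() of an already saturated probability.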
What is the difference between binary cross entropy and categorical cross entropy loss function?
Here is a good set of answers to that question.
Edit 1: My bad, use binary_crossentropy.
After a quick look at the code (again), I can see that Keras uses:
binary_crossentropy -> tf.nn.sigmoid_cross_entropy_with_logits
(From tf docs): Measures the probability error in discrete classification tasks in which each class is independent and not mutually exclusive. For instance, one could perform multilabel classification where a picture can contain both an elephant and a dog at the same time.
categorical_crossentropy -> tf.nn.softmax_cross_entropy_with_logits
(From tf docs): Measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class). For example, each CIFAR-10 image is labeled with one and only one label: an image can be a dog or a truck, but not both.
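The difference between the two descriptions above can be seen on a toy example. This is only a sketch in plain Python with made-up logits for four pixels: softmax makes the outputs compete (they always sum to 1), while sigmoid scores each output on its own, so several pixels can be close to 1 at the same time.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sigmoid(x):
    # Fine for the moderate logits used here.
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical logits: three ROI pixels and one background pixel.
logits = [3.0, 3.0, -3.0, 3.0]
target = [1.0, 1.0, 0.0, 1.0]

soft = softmax(logits)              # mutually exclusive: values sum to 1
sig = [sigmoid(x) for x in logits]  # independent: each value is in (0, 1)

print(sum(soft), soft)  # the three ROI pixels have to share the total of 1
print(sig)              # the three ROI pixels can all be close to 1
```

With softmax, the three ROI pixels end up at roughly one third each, so the output can never match a mask that contains many 1s. With sigmoid, every pixel can match its own target, which is what a segmentation mask needs.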