I am training a model to segment machine-printed text from images. The images may also contain barcodes and handwritten text. The ground-truth images are processed so that 0 represents machine print and 1 represents everything else. I am using a 5-layer CNN with dilation, which outputs 2 maps at the end.
My loss is calculated as follows:
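import tensorflow as tf

def loss(logits, labels):
    # Flatten to one row of 2 class scores per pixel and one integer label per pixel.
    logits = tf.reshape(logits, [-1, 2])
    labels = tf.reshape(labels, [-1])
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels)
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    return cross_entropy_mean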
I also have some images that contain only handwritten text; their ground truths are blank pages, represented entirely by 1s.
When I train the model on these images, I get a loss of 0 and a training accuracy of 100%. Is this correct? How can the loss be zero? For other images, which contain barcodes or machine print, I get a nonzero loss and training converges properly.
When I test this model, barcodes are correctly ignored, but it outputs both machine print and handwritten text, whereas I need only machine print.
Can someone point out where I am going wrong?
UPDATE 1:
I was using a learning rate of 0.01 before; changing it to 0.0001 gave me a nonzero loss, and it seems to converge, but not very well. But then how can a high learning rate give a loss of 0?
When I use the same model in Caffe with a learning rate of 0.01, I get a nonzero loss and it converges well compared to TensorFlow.
Your loss calculation looks fine, but a loss of zero is odd in your case. Have you tried adjusting the learning rate? Try decreasing it. I have run into strange loss values before, and decreasing the learning rate helped.
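To see how the mean cross-entropy can reach zero on the all-1s images, here is a rough pure-Python sketch of the same per-pixel loss. It is not your TensorFlow graph: the helper names, the 4-pixel "image" and the margin values are made up for illustration. With two classes and label 1, the loss is log(1 + exp(z0 - z1)), so once the network pushes the class-1 score far above the class-0 score at every pixel, the loss becomes numerically zero.

```python
import math

def sparse_softmax_cross_entropy(logits, label):
    """Cross-entropy for one pixel: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max before exponentiating, for stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def mean_loss(pixel_logits, pixel_labels):
    """Mean of the per-pixel losses, analogous to a reduce_mean over pixels."""
    losses = [sparse_softmax_cross_entropy(z, y)
              for z, y in zip(pixel_logits, pixel_labels)]
    return sum(losses) / len(losses)

# Hypothetical blank ground truth: every pixel is labelled 1.
labels = [1] * 4

# margin = (class-1 score) - (class-0 score), identical at every pixel here.
for margin in (0.0, 5.0, 50.0):
    logits = [[0.0, margin]] * 4
    print(margin, mean_loss(logits, labels))
```

At a margin of 0 the loss is log(2); it shrinks as the margin grows, and at a margin of 50 it is exactly 0.0 in double precision. TensorFlow usually computes in float32, where this saturation happens at a smaller margin. So a loss of 0 with 100% accuracy on those images is arithmetically possible if the network predicts class 1 everywhere with large scores. Whether the higher learning rate is what drives it there in your setup is only a guess, but it is consistent with the smaller rate giving a nonzero loss.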