I am getting NaN when I attempt to use the sparse_softmax_cross_entropy_with_logits loss function in TensorFlow. I have a simple network, something like:
layer = tf.nn.relu(tf.matmul(inputs, W1) + b1)
layer = tf.nn.relu(tf.matmul(layer, W2) + b2)
logits = tf.matmul(layer, W3) + b3
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
I have many classes (~10000), so I imagine I am getting NaN because the logit corresponding to the correct class in at least one of my examples got truncated to zero. Is there a way to avoid this?
It actually turned out that some of my labels were out of range (e.g., a label of 14000 when my logits matrix is only 150 x 10000). In TensorFlow this silently produces NaN rather than raising an error.
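One way to catch this is to check that every label lies in [0, num_classes) before the loss is computed. Below is a minimal sketch in TF 1.x graph style, matching the question's API; the placeholder shapes and the num_classes value are assumptions for illustration, not the original code.

import tensorflow as tf

num_classes = 10000  # assumed from the question (~10000 classes)
labels = tf.placeholder(tf.int64, shape=[None])
logits = tf.placeholder(tf.float32, shape=[None, num_classes])

# Fail fast if any label falls outside [0, num_classes) instead of
# silently producing NaN in the loss.
labels_in_range = tf.reduce_all(
    tf.logical_and(labels >= 0, labels < num_classes))
assert_op = tf.Assert(labels_in_range, [labels], summarize=10)

with tf.control_dependencies([assert_op]):
    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=labels, logits=logits))

With this dependency in place, a batch containing an out-of-range label fails with an InvalidArgumentError that prints the offending labels, instead of NaN quietly appearing in the loss.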