I am using the sigmoid cross-entropy loss function for a multi-label classification problem, as laid out by this tutorial. However, both in the tutorial's results and in mine, the output predictions are in the range (-Inf, Inf), while the range of a sigmoid is (0, 1). Is the sigmoid only applied during backprop? That is, shouldn't a forward pass squash the output?
In this example the input to the "SigmoidCrossEntropyLoss" layer is the output of a fully-connected layer. Indeed, there are no constraints on the outputs of an "InnerProduct" layer, so they can be anywhere in (-inf, inf).
However, if you examine "SigmoidCrossEntropyLoss" carefully, you'll notice that it applies a "Sigmoid" internally: the sigmoid is fused with the loss computation to ensure numerically stable gradient estimation, so the layer consumes raw logits and never exposes the squashed values.
Therefore, at test time, you should replace the "SigmoidCrossEntropyLoss" layer with a plain "Sigmoid" layer to output per-class probabilities.
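To see why the fusion matters, here is a sketch in NumPy (not Caffe's actual C++ implementation; function names are my own) contrasting the numerically stable fused form, max(x, 0) - x*z + log(1 + exp(-|x|)), with the naive sigmoid-then-log version, which blows up for large-magnitude logits:

```python
import numpy as np

def stable_sigmoid_xent(x, z):
    # Fused, numerically stable sigmoid cross-entropy:
    #   max(x, 0) - x*z + log(1 + exp(-|x|))
    # exp() is only ever called on a non-positive argument, so it cannot overflow.
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

def naive_sigmoid_xent(x, z):
    # Naive version: squash first, then take logs.
    # For large |x|, p saturates to exactly 0.0 or 1.0 in float arithmetic,
    # so log(p) or log(1 - p) becomes -inf and the loss turns into NaN.
    p = 1.0 / (1.0 + np.exp(-x))
    return -(z * np.log(p) + (1 - z) * np.log(1 - p))
```

For moderate logits the two agree, but at x = 1000 the naive form returns NaN while the stable form correctly returns a loss of ~0 for a positive label.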
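A minimal sketch of that swap in prototxt (layer and blob names here are hypothetical; adapt them to your net):

```
# Training net: fused loss layer consuming raw logits
layer {
  name: "loss"
  type: "SigmoidCrossEntropyLoss"
  bottom: "fc_out"
  bottom: "labels"
  top: "loss"
}

# Deploy/test net: plain sigmoid producing per-class probabilities
layer {
  name: "prob"
  type: "Sigmoid"
  bottom: "fc_out"
  top: "prob"
}
```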