In the Keras backend, `K.binary_crossentropy` has a `from_logits` flag. What is the difference between normal binary crossentropy and binary crossentropy with logits? Suppose I am using a seq2seq model and my output sequence is of the form 100111100011101.

What should I use for a recurrent LSTM or RNN to learn from this data, given that I feed a similar sequence as input along with timesteps?
This depends on whether or not you have a sigmoid layer just before the loss function.
If there is a sigmoid layer, it squeezes the class scores into probabilities, so in this case `from_logits` should be `False`. The loss function will transform the probabilities back into logits, because that is what `tf.nn.sigmoid_cross_entropy_with_logits` expects.

If the output is already a logit (i.e., a raw score), pass `from_logits=True`; no transformation will be made.
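To make this concrete, here is a minimal sketch (the tensor values are made up for illustration) showing that the two wirings compute the same loss, whether you pass raw logits with `from_logits=True` or sigmoid-squashed probabilities with `from_logits=False`:

```python
import tensorflow as tf

y_true = tf.constant([[1.0, 0.0, 1.0]])
logits = tf.constant([[2.0, -1.0, 0.5]])  # raw scores, no sigmoid applied

# Case 1: the model outputs raw logits -> pass them directly
loss_from_logits = tf.keras.losses.binary_crossentropy(
    y_true, logits, from_logits=True)

# Case 2: a sigmoid layer already turned the scores into probabilities
probs = tf.sigmoid(logits)
loss_from_probs = tf.keras.losses.binary_crossentropy(
    y_true, probs, from_logits=False)

# Both paths should yield the same loss, up to numerical precision
print(loss_from_logits.numpy(), loss_from_probs.numpy())
```

The `from_logits=True` path is slightly more numerically stable, since the sigmoid and the log in the cross-entropy are fused internally instead of being applied one after the other.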
Both options are possible, and the choice depends on your network architecture. By the way, if the term logit seems scary, take a look at this question, which discusses it in detail.
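For the sequence-labelling setup in the question, either architecture works; this is a hypothetical sketch (the layer sizes and the 15-timestep shape are assumptions, not from the question) showing how each choice pairs with the matching `from_logits` setting:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Assumed input shape: 15 timesteps, 1 binary feature per step
inputs = layers.Input(shape=(15, 1))
x = layers.LSTM(32, return_sequences=True)(inputs)

# Variant A: sigmoid inside the model -> from_logits=False (the default)
probs = layers.Dense(1, activation="sigmoid")(x)
model_a = Model(inputs, probs)
model_a.compile(optimizer="adam",
                loss=tf.keras.losses.BinaryCrossentropy(from_logits=False))

# Variant B: raw scores out of the model -> from_logits=True
logits = layers.Dense(1)(x)  # note: no activation
model_b = Model(inputs, logits)
model_b.compile(optimizer="adam",
                loss=tf.keras.losses.BinaryCrossentropy(from_logits=True))
```

With variant B, remember to apply a sigmoid yourself at inference time if you want probabilities out of `model_b.predict`.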