 

Cross Entropy in PyTorch

I'm a bit confused by the cross entropy loss in PyTorch.

Considering this example:

import torch
import torch.nn as nn
from torch.autograd import Variable

output = Variable(torch.FloatTensor([0, 0, 0, 1])).view(1, -1)
target = Variable(torch.LongTensor([3]))

criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)
print(loss)

I would expect the loss to be 0. But I get:

Variable containing:
 0.7437
[torch.FloatTensor of size 1]

As far as I know cross entropy can be calculated like this:

[image: the cross entropy formula, H(p, q) = -∑_x p(x) · log q(x)]

But shouldn't the result then be 1*log(1) = 0?

I also tried other inputs, such as one-hot encodings, but that made no difference, so the input shape passed to the loss function does not seem to be the problem.

I would be really grateful if someone could help me out and tell me where my mistake is.

Thanks in advance!

asked Mar 20 '18 by MBT

People also ask

What is cross entropy in PyTorch?

This criterion computes the cross entropy loss between input and target. It is useful when training a classification problem with C classes. If provided, the optional argument weight should be a 1D Tensor assigning weight to each of the classes. This is particularly useful when you have an unbalanced training set.
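For example, a minimal sketch (with a made-up 3-class problem and arbitrary weights) of how the optional weight argument can be passed:

import torch
import torch.nn as nn

# give the under-represented class 0 twice the weight of the other classes
class_weights = torch.FloatTensor([2.0, 1.0, 1.0])
criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.FloatTensor([[0.2, 1.5, -0.3]])  # raw scores for one sample
target = torch.LongTensor([0])                  # true class index
print(criterion(logits, target))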

What is cross entropy in machine learning?

Cross-entropy is the average number of bits needed to encode events drawn from distribution A when using a code optimized for distribution B. In machine learning, cross entropy is used to compare the probabilities a model predicts with the actual outcomes, so it serves as a measure of how far the model's predictions are from the expected results.

What is cross entropy in Tensorflow?

Cross entropy can be used to define a loss function (cost function) in machine learning and optimization. It is defined on probability distributions, not single values. It works for classification because classifier output is (often) a probability distribution over class labels.

What is cross entropy loss used for?

Cross entropy loss is a metric used to measure how well a classification model in machine learning performs. The loss is a non-negative number, with 0 corresponding to a perfect model; the goal is generally to get your model's loss as close to 0 as possible.


2 Answers

In your example you are treating the output [0, 0, 0, 1] as probabilities, as required by the mathematical definition of cross entropy. But PyTorch treats it as raw scores (logits) that don't need to sum to 1; it first converts them into probabilities using the softmax function.

So H(p, q) becomes:

H(p, softmax(output)) 

Translating the output [0, 0, 0, 1] into probabilities:

softmax([0, 0, 0, 1]) = [0.1749, 0.1749, 0.1749, 0.4754] 

whence:

-log(0.4754) = 0.7437 
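
As a quick check, here is a minimal sketch (reusing the code from the question) that applies the softmax step by hand and reproduces the same number:

import torch
import torch.nn as nn
from torch.autograd import Variable

output = Variable(torch.FloatTensor([0, 0, 0, 1])).view(1, -1)
target = Variable(torch.LongTensor([3]))

# convert the raw outputs into probabilities with softmax
probs = nn.functional.softmax(output, dim=1)   # [0.1749, 0.1749, 0.1749, 0.4754]

# cross entropy is the negative log probability of the target class (index 3)
manual_loss = -torch.log(probs[0, 3])          # -log(0.4754) = 0.7437

criterion = nn.CrossEntropyLoss()
print(criterion(output, target))               # 0.7437
print(manual_loss)                             # 0.7437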
answered Sep 23 '22 by Old Dog


Your understanding is correct, but PyTorch doesn't compute cross entropy that way. PyTorch uses the following formula:

loss(x, class) = -log(exp(x[class]) / (\sum_j exp(x[j])))
               = -x[class] + log(\sum_j exp(x[j]))

Since, in your scenario, x = [0, 0, 0, 1] and class = 3, if you evaluate the above expression, you would get:

loss(x, class) = -1 + log(exp(0) + exp(0) + exp(0) + exp(1))
               = 0.7437

Note that PyTorch uses the natural logarithm.
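
For illustration, a minimal sketch (plain tensors, same x and class as above) that evaluates this formula directly and compares it against nn.CrossEntropyLoss:

import torch
import torch.nn as nn

x = torch.FloatTensor([0, 0, 0, 1]).view(1, -1)
target = torch.LongTensor([3])

# loss(x, class) = -x[class] + log(sum_j exp(x[j]))
manual = -x[0, 3] + torch.log(torch.exp(x).sum())
print(manual)                       # 0.7437

criterion = nn.CrossEntropyLoss()
print(criterion(x, target))         # 0.7437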

answered Sep 21 '22 by Wasi Ahmad