For example, if I want to solve the MNIST classification problem, there are 10 output classes. In PyTorch, I would like to use torch.nn.CrossEntropyLoss. Do I have to format the targets so that they are one-hot encoded, or can I simply use the class labels that come with the dataset?
CrossEntropyLoss expects class indices and does not take one-hot encoded tensors as target labels.
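For instance, here is a minimal sketch with made-up logits and labels (the values below are placeholders, not real MNIST data):

import torch
import torch.nn as nn

# Hypothetical batch: 4 samples, each with 10 raw class scores (logits).
logits = torch.randn(4, 10)
# Targets are plain class indices, exactly as the MNIST dataset provides them,
# not one-hot vectors.
targets = torch.tensor([3, 7, 0, 9])

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, targets)
print(loss)  # a scalar loss tensor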
As an aside, creating a one-hot encoding in PyTorch is straightforward: import torch, build a tensor of class indices, and apply the one_hot() function to it.
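A minimal sketch of that, using torch.nn.functional.one_hot on some made-up labels:

import torch
import torch.nn.functional as F

# A tensor of class indices (placeholder values).
labels = torch.tensor([3, 7, 0, 9])

# one_hot infers the number of classes from the data unless num_classes is given.
onehot = F.one_hot(labels, num_classes=10)
print(onehot.shape)  # torch.Size([4, 10])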
Categorical cross-entropy loss is closely related to the softmax function, since in practice it is almost always used with networks that have a softmax layer at the output.
nn.CrossEntropyLoss expects integer labels. Internally it does not one-hot encode the class label at all; it applies log-softmax to the network outputs and then uses the label to index into the resulting vector of log-probabilities when computing the loss. This small but important detail makes computing the loss easier, and it is equivalent to one-hot encoding the target and measuring the loss per output neuron, since every entry of the encoded target would be zero except the one at the target class. Therefore, there is no need to one-hot encode your data if the labels are already provided as class indices.
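To make that concrete, here is a small sketch (again with placeholder tensors) showing that indexing the log-softmax output with the integer labels reproduces the value nn.CrossEntropyLoss returns:

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.randn(4, 10)            # hypothetical raw network outputs
targets = torch.tensor([3, 7, 0, 9])   # integer class labels

# What nn.CrossEntropyLoss computes (default mean reduction):
ce = nn.CrossEntropyLoss()(logits, targets)

# Manual equivalent: log-softmax the logits, pick out the log-probability of the
# target class for each sample by indexing, negate, and average.
logp = F.log_softmax(logits, dim=1)
manual = -logp[torch.arange(len(targets)), targets].mean()

print(torch.allclose(ce, manual))  # True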
The documentation has some more insight on this: https://pytorch.org/docs/master/generated/torch.nn.CrossEntropyLoss.html. There you'll see the target argument, which serves as part of the input parameters. These are your labels, and they are described as containing class indices in the range [0, C-1], where C is the number of classes.
This clearly shows how the input should be shaped and what is expected. If you did want to one-hot encode your data, you would need to use torch.nn.functional.one_hot. To best replicate what the cross-entropy loss is doing under the hood, you'd also need nn.functional.log_softmax as the final output, and you'd have to write your own loss layer, since none of the built-in PyTorch loss layers take log-softmax inputs together with one-hot encoded targets. However, nn.CrossEntropyLoss combines both of these operations and is preferred when your targets are simply class labels, so there is no need to do the conversion. A sketch comparing the two follows.
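As a sketch of that comparison (placeholder tensors again), the hand-rolled one-hot plus log-softmax loss matches what nn.CrossEntropyLoss computes directly from the integer labels:

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.randn(4, 10)            # hypothetical raw network outputs
targets = torch.tensor([3, 7, 0, 9])   # integer class labels

# Hand-rolled version: one-hot encode the targets, log-softmax the outputs,
# and write the cross-entropy reduction yourself.
onehot = F.one_hot(targets, num_classes=10).float()
logp = F.log_softmax(logits, dim=1)
custom_loss = -(onehot * logp).sum(dim=1).mean()

# Built-in version: nn.CrossEntropyLoss works straight from the class indices.
builtin_loss = nn.CrossEntropyLoss()(logits, targets)

print(torch.allclose(custom_loss, builtin_loss))  # True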