I am very confused by how PyTorch deals with one-hot vectors. In this tutorial, the neural network generates a one-hot vector as its output. As far as I understand, the schematic structure of the neural network in the tutorial should look like this:
However, the labels are not in one-hot vector format. I get the following sizes:
    print(labels.size())
    print(outputs.size())

    output>>> torch.Size([4])
    output>>> torch.Size([4, 10])
Miraculously, when I pass the outputs and labels to criterion = CrossEntropyLoss(), there is no error at all.
    loss = criterion(outputs, labels)  # How come it has no error?
Maybe PyTorch automatically converts the labels to one-hot vector form. So I tried converting the labels to one-hot vectors myself before passing them to the loss function.
    import numpy as np

    def to_one_hot_vector(num_class, label):
        # Build a (batch, num_class) array with a 1 at each label's index
        b = np.zeros((label.shape[0], num_class))
        b[np.arange(label.shape[0]), label] = 1
        return b

    labels_one_hot = to_one_hot_vector(10, labels)
    labels_one_hot = torch.Tensor(labels_one_hot)
    labels_one_hot = labels_one_hot.type(torch.LongTensor)

    loss = criterion(outputs, labels_one_hot)  # Now it gives me an error
However, I got the following error:
RuntimeError: multi-target not supported at /opt/pytorch/pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:15
So, are one-hot vectors not supported in PyTorch? How does PyTorch calculate the cross entropy for the two tensors outputs = [1,0,0],[0,0,1] and labels = [0,2]? It doesn't make sense to me at all at the moment.
In this example, we implement one-hot encoding. First, we import the required packages, such as torch. Next, we create a tensor with the rand() function. Finally, we apply the one_hot() function to the result of argmax().
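The steps described here might look roughly like the following sketch. The original example is not shown, so the tensor shape (4 samples, 6 classes), the seed, and the variable names are assumptions chosen purely for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility

# Random scores for 4 samples over 6 classes (shape chosen for illustration)
scores = torch.rand(4, 6)

# Index of the highest-scoring class for each sample, shape [4]
class_idx = torch.argmax(scores, dim=1)

# One-hot encode the class indices, shape [4, 6], dtype int64
one_hot = F.one_hot(class_idx, num_classes=6)

print(class_idx)
print(one_hot)
```

Each row of `one_hot` contains a single 1 at the position given by `class_idx`, so applying `argmax` along `dim=1` recovers the original indices.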
One-hot encoding is the process of converting categorical variables into a numeric form that machine learning and deep learning algorithms can consume, which in turn can improve a model's predictions and classification accuracy.
What is *? For .view(), PyTorch expects the new shape to be provided as individual int arguments (represented in the docs as *shape). In Python, the asterisk (*) unpacks a list into its individual elements, so view receives its arguments in the form it expects.
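As a small illustration of the unpacking described above, here is a sketch; the tensor contents and the name `new_shape` are assumptions made up for the example.

```python
import torch

x = torch.arange(12)   # 1-D tensor with 12 elements
new_shape = [3, 4]     # hypothetical target shape stored in a list

# *new_shape unpacks the list, so this call is equivalent to x.view(3, 4)
a = x.view(*new_shape)
b = x.view(3, 4)

print(a.shape)
print(torch.equal(a, b))
```

Unpacking is handy when the shape is computed at runtime and held in a list or tuple rather than written out literally.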
PyTorch's documentation for CrossEntropyLoss states:
This criterion expects a class index (0 to C-1) as the target for each value of a 1D tensor of size minibatch
In other words, CrossEntropyLoss (CEL) has the equivalent of your to_one_hot_vector function conceptually built in and does not expose a one-hot API. Note that one-hot vectors are also memory-inefficient compared to storing class labels.
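To make this concrete, the sketch below uses the outputs = [1,0,0],[0,0,1] and labels = [0,2] from the question and compares the built-in loss against two manual computations. It assumes the outputs are raw scores (logits), which is what CrossEntropyLoss expects, and uses the default mean reduction; the variable names are illustrative. Recent PyTorch releases also accept floating-point class-probability targets, but that is not used here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Raw scores (logits) for 2 samples over 3 classes, shape [2, 3]
outputs = torch.tensor([[1.0, 0.0, 0.0],
                        [0.0, 0.0, 1.0]])
# Class indices, shape [2]; NOT one-hot
labels = torch.tensor([0, 2])

# Built-in loss: log-softmax followed by negative log-likelihood, averaged
criterion = nn.CrossEntropyLoss()
loss_builtin = criterion(outputs, labels)

# Manual version 1: explicit one-hot mask over the log-probabilities
log_probs = F.log_softmax(outputs, dim=1)
one_hot = F.one_hot(labels, num_classes=3).float()
loss_one_hot = -(one_hot * log_probs).sum(dim=1).mean()

# Manual version 2: index the log-probability of the true class directly,
# which is why the one-hot vector never needs to be materialized
picked = log_probs[torch.arange(len(labels)), labels]
loss_indexed = -picked.mean()

print(loss_builtin.item(), loss_one_hot.item(), loss_indexed.item())
```

All three values agree, which shows that the class index simply selects which log-probability contributes to the loss. Note that the loss is not zero even though the scores look one-hot, because softmax is applied to the outputs first.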
If you are given one-hot vectors and need to convert them to class labels (for instance, to be compatible with CrossEntropyLoss), you can use argmax as shown below:
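    import torch

    labels = torch.tensor([1, 2, 3, 5])

    # Build the one-hot representation: 4 samples, 6 classes
    one_hot = torch.zeros(4, 6)
    one_hot[torch.arange(4), labels] = 1

    # Recover the class indices from the one-hot rows
    reverted = torch.argmax(one_hot, dim=1)
    assert (labels == reverted).all().item()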