
Pytorch inputs for nn.CrossEntropyLoss()

I am trying to perform logistic regression in PyTorch on a simple 0/1-labelled dataset. The criterion (loss) is defined as criterion = nn.CrossEntropyLoss() and the model is model = LogisticRegression(1,2).

I have a data point which is a pair, dat = (-3.5, 0): the first element is the data point and the second is the corresponding label.
Then I convert the first element of the input to a tensor: tensor_input = torch.Tensor([dat[0]]).
Then I apply the model to the tensor_input: outputs = model(tensor_input).
Then I convert the label to a tensor: tensor_label = torch.Tensor([dat[1]]).
Now, when I try to compute the loss, it breaks: loss = criterion(outputs, tensor_label). It gives an error: RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

import torch
import torch.nn as nn

class LogisticRegression(nn.Module):
    def __init__(self, input_size, num_classes):
        super(LogisticRegression, self).__init__()
        self.linear = nn.Linear(input_size, num_classes) 

    def forward(self, x):
        out = self.linear(x)
        return out

model = LogisticRegression(1,2)
criterion = nn.CrossEntropyLoss()
dat = (-3.5, 0)
tensor_input = torch.Tensor([dat[0]])
outputs = model(tensor_input)
tensor_label = torch.Tensor([dat[1]])
loss = criterion(outputs, tensor_label)  # RuntimeError raised here

I can't for the life of me figure it out.

Asked Dec 26 '18 by Amadej Kristjan Kocbek
2 Answers

For the most part, the PyTorch documentation does an amazing job of explaining the different functions; it usually includes the expected input dimensions, as well as some simple examples.
You can find the description for nn.CrossEntropyLoss() here.

To walk through your specific example, let us start by looking at the expected input dimension:

Input: (N,C) where C = number of classes. [...]

To add to this, N generally refers to the batch size (number of samples). To compare this to what you currently have:

outputs.shape
>>> torch.Size([2])

I.e., we currently only have an input of dimension (2,), and not (1,2) as PyTorch expects. We can fix this by adding a "fake" batch dimension to our current tensor, simply by using .unsqueeze() like so:

outputs = model(tensor_input).unsqueeze(dim=0)
outputs.shape
>>> torch.Size([1,2])

Now that we got that, let us look at the expected input for the targets:

Target: (N) [...]

So we already have the right shape for this. If we try it now, though, we will still encounter an error:

RuntimeError: Expected object of scalar type Long but got scalar type Float 
              for argument #2 'target'.

Again, the error message is rather expressive. The problem here is that PyTorch tensors are (by default) interpreted as torch.FloatTensors, but the target should be integers (Long) instead. We can fix this simply by specifying the exact type during tensor creation:

tensor_label = torch.LongTensor([dat[1]])

I'm using PyTorch 1.0 under Linux fyi.
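
For completeness, here is what the full snippet from the question looks like with both fixes applied. This is just a sketch combining the changes above; the exact loss value will vary with the randomly initialized weights:

import torch
import torch.nn as nn

class LogisticRegression(nn.Module):
    def __init__(self, input_size, num_classes):
        super(LogisticRegression, self).__init__()
        self.linear = nn.Linear(input_size, num_classes)

    def forward(self, x):
        return self.linear(x)

model = LogisticRegression(1, 2)
criterion = nn.CrossEntropyLoss()

dat = (-3.5, 0)
tensor_input = torch.Tensor([dat[0]])
# Fix 1: add a batch dimension so the loss input has shape (N, C) = (1, 2).
outputs = model(tensor_input).unsqueeze(dim=0)
# Fix 2: the target must be a Long tensor of class indices with shape (N,) = (1,).
tensor_label = torch.LongTensor([dat[1]])

loss = criterion(outputs, tensor_label)
print(loss.item())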

Answered Nov 15 '22 by dennlinger

To perform logistic regression in PyTorch you need 3 things:

  • Labels (targets) encoded as 0 or 1;
  • A sigmoid activation on the last layer, so the number of outputs will be 1;
  • Binary cross entropy as the loss function.

Here is a minimal example:

import torch
import torch.nn as nn


class LogisticRegression(nn.Module):
    def __init__(self, n_inputs, n_outputs):
        super(LogisticRegression, self).__init__()
        self.linear = nn.Linear(n_inputs, n_outputs)
        self.sigmoid = nn.Sigmoid()


    def forward(self, x):
        x = self.linear(x)
        return self.sigmoid(x)


# Initialize your model.
# Note: n_outputs is 1, because the logistic function returns a single value in the range (0, 1).
model = LogisticRegression(n_inputs=1, n_outputs=1)
# Define the binary cross entropy loss:
criterion = nn.BCELoss()

# dummy data
data = (42.0, 0)
tensor_input = torch.Tensor([data[0]])
tensor_label = torch.Tensor([data[1]])

outputs = model(tensor_input)

loss = criterion(outputs, tensor_label)

print(loss.item())
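
As a small aside (my own suggestion, not part of the answer above): you can also drop the explicit Sigmoid layer and use nn.BCEWithLogitsLoss, which applies the sigmoid internally and is more numerically stable. A minimal sketch of that variant:

import torch
import torch.nn as nn

# No Sigmoid in the model: BCEWithLogitsLoss applies it internally.
logit_model = nn.Linear(1, 1)
criterion = nn.BCEWithLogitsLoss()

data = (42.0, 0)
tensor_input = torch.Tensor([data[0]])
tensor_label = torch.Tensor([data[1]])

loss = criterion(logit_model(tensor_input), tensor_label)
print(loss.item())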
Answered Nov 15 '22 by trsvchn