PyTorch custom loss function

How should a custom loss function be implemented? The code below gives wrong predictions and a warning when I switch to my custom loss:

    import torch
    import torch.nn as nn
    import torchvision
    import torchvision.transforms as transforms
    import numpy as np
    import matplotlib.pyplot as plt
    import torch.utils.data as data_utils
    import torch.nn as nn
    import torch.nn.functional as F

    num_epochs = 20

    x1 = np.array([0,0])
    x2 = np.array([0,1])
    x3 = np.array([1,0])
    x4 = np.array([1,1])

    num_epochs = 200

    class cus2(torch.nn.Module):

        def __init__(self):
            super(cus2,self).__init__()

        def forward(self, outputs, labels):
            # reshape labels to give a flat vector of length batch_size*seq_len
            labels = labels.view(-1)

            # mask out 'PAD' tokens
            mask = (labels >= 0).float()

            # the number of tokens is the sum of elements in mask
            num_tokens = int(torch.sum(mask).data[0])

            # pick the values corresponding to labels and multiply by mask
            outputs = outputs[range(outputs.shape[0]), labels]*mask

            # cross entropy loss for all non 'PAD' tokens
            return -torch.sum(outputs)/num_tokens

    x = torch.tensor([x1,x2,x3,x4]).float()

    y = torch.tensor([0,1,1,0]).long()

    train = data_utils.TensorDataset(x,y)
    train_loader = data_utils.DataLoader(train , batch_size=2 , shuffle=True)

    device = 'cpu'

    input_size = 2
    hidden_size = 100
    num_classes = 2

    learning_rate = .0001

    class NeuralNet(nn.Module) :
        def __init__(self, input_size, hidden_size, num_classes) :
            super(NeuralNet, self).__init__()
            self.fc1 = nn.Linear(input_size , hidden_size)
            self.relu = nn.ReLU()
            self.fc2 = nn.Linear(hidden_size , num_classes)

        def forward(self, x) :
            out = self.fc1(x)
            out = self.relu(out)
            out = self.fc2(out)
            return out

    for i in range(0 , 1) :

        model = NeuralNet(input_size, hidden_size, num_classes).to(device)

        criterion = nn.CrossEntropyLoss()
    #     criterion = Regress_Loss()
    #     criterion = cus2()
        optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

        total_step = len(train_loader)
        for epoch in range(num_epochs) :
            for i,(images , labels) in enumerate(train_loader) :
                images = images.reshape(-1 , 2).to(device)
                labels = labels.to(device)

                outputs = model(images)
                loss = criterion(outputs , labels)

                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
    #             print(loss)

        outputs = model(x)

        print(outputs.data.max(1)[1])

With nn.CrossEntropyLoss as the criterion, this makes perfect predictions on the training data:

tensor([0, 1, 1, 0])

I took a custom loss function from a tutorial (originally attached as an image of the code used for the cus2 class) and implemented it in the code above as cus2.

Uncommenting the line # criterion = cus2() to use this loss function gives these predictions instead:

tensor([0, 0, 0, 0])

A warning is also raised:

UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number

Have I implemented the custom loss function incorrectly?


asked Dec 30 '18 17:12

blue-sky

1 Answer

Your loss function is programmatically correct except for the following line:

    # the number of tokens is the sum of elements in mask
    num_tokens = int(torch.sum(mask).data[0])

torch.sum returns a 0-dimensional tensor, which cannot be indexed with [0], hence the warning. To fix this, use int(torch.sum(mask).item()) as the warning suggests; int(torch.sum(mask)) works too.
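As a small illustration of the point above, here is a sketch that uses a made-up mask (the values are an assumption, not taken from the question) to show that the sum is 0-dimensional and that both conversions give the same Python integer:

```python
import torch

# Hypothetical mask with three non-PAD positions (illustrative values only).
mask = torch.tensor([1.0, 1.0, 0.0, 1.0])

total = torch.sum(mask)            # a 0-dimensional tensor, so total[0] is invalid
num_tokens_a = int(total.item())   # conversion suggested by the warning
num_tokens_b = int(total)          # also works for a single-element tensor

print(total.dim(), num_tokens_a, num_tokens_b)  # 0 3 3
```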

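Now, are you trying to emulate the CE loss with the custom loss? If so, you are missing the log_softmax: nn.CrossEntropyLoss applies log_softmax to the raw model outputs internally, whereas your loss indexes the raw outputs directly.

To fix that, add outputs = torch.nn.functional.log_softmax(outputs, dim=1) before statement 4 of forward (the line that picks the values corresponding to labels). Note that in the tutorial you attached, log_softmax is already applied in the model's forward call. You can do that too.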
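Putting both fixes together, the loss might look like the sketch below. The class name MaskedCrossEntropy and the sample logits are assumptions made for illustration; the body otherwise follows the cus2 code from the question. When no label is negative, it should agree with nn.CrossEntropyLoss, which the last lines check.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedCrossEntropy(nn.Module):  # hypothetical name for the corrected cus2
    def forward(self, outputs, labels):
        # flatten labels to a vector of length batch_size*seq_len
        labels = labels.view(-1)
        # 1.0 for real tokens, 0.0 for 'PAD' tokens (negative labels)
        mask = (labels >= 0).float()
        # .item() converts the 0-dim sum to a Python number
        num_tokens = int(torch.sum(mask).item())
        # convert raw scores to log-probabilities before picking values
        log_probs = F.log_softmax(outputs, dim=1)
        # pick the log-probability of each label and zero out 'PAD' positions
        picked = log_probs[range(log_probs.shape[0]), labels] * mask
        # mean negative log-likelihood over non-'PAD' tokens
        return -torch.sum(picked) / num_tokens


# Illustrative raw scores for four samples and two classes (made-up values).
logits = torch.tensor([[2.0, 0.5], [0.1, 1.5], [1.0, 1.0], [0.3, -0.2]])
labels = torch.tensor([0, 1, 1, 0])

custom = MaskedCrossEntropy()(logits, labels)
reference = nn.CrossEntropyLoss()(logits, labels)
print(torch.allclose(custom, reference))  # True
```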

Also, I noticed that the learning rate is low (1e-4), and even with CE loss the results are not consistent from run to run. Increasing the learning rate to 1e-3 works well for me with both the custom loss and CE loss.
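For reference, a compact sketch of the same training setup with the learning rate raised to 1e-3. The nn.Sequential model stands in for the NeuralNet class from the question, and the fixed seed is an assumption added only to make runs repeatable; predictions can still differ across seeds and PyTorch versions, so no expected output is shown.

```python
import torch
import torch.nn as nn
import torch.utils.data as data_utils

torch.manual_seed(0)  # assumption: seed chosen arbitrarily for repeatability

# XOR-style data from the question
x = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]]).float()
y = torch.tensor([0, 1, 1, 0]).long()
loader = data_utils.DataLoader(
    data_utils.TensorDataset(x, y), batch_size=2, shuffle=True
)

# Same layer sizes as NeuralNet in the question: 2 -> 100 -> 2
model = nn.Sequential(nn.Linear(2, 100), nn.ReLU(), nn.Linear(100, 2))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # raised from 1e-4

for epoch in range(200):
    for inputs, labels in loader:
        loss = criterion(model(inputs), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Evaluate on the full training set without tracking gradients
with torch.no_grad():
    final_loss = criterion(model(x), y)
    preds = model(x).argmax(dim=1)

print(final_loss.item(), preds)
```

Swapping criterion for the corrected custom loss should leave the loop unchanged, since both take (outputs, labels) and return a scalar.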

answered Oct 06 '22 08:10

Umang Gupta
