 

What is running loss in PyTorch and how is it calculated

I had a look at this tutorial in the PyTorch docs for understanding Transfer Learning. There was one line that I failed to understand.

After the loss is calculated using loss = criterion(outputs, labels), the running loss is calculated using running_loss += loss.item() * inputs.size(0) and finally, the epoch loss is calculated using running_loss / dataset_sizes[phase].

Isn't loss.item() supposed to return the loss for the entire mini-batch (please correct me if I am wrong)? That is, if the batch_size is 4, loss.item() would give the loss for the entire set of 4 images. If this is true, why is loss.item() multiplied by inputs.size(0) when calculating running_loss? Isn't this an extra multiplication in that case?

Any help would be appreciated. Thanks!

Asked Apr 08 '20 by Jitesh Malipeddi

People also ask

What does loss.item() do in PyTorch?

loss.item() returns the value as a standard Python number and moves the data to the CPU. It converts the tensor's value into a plain Python number, and a plain Python number can only live on the CPU.
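A minimal sketch of this behavior, using a dummy one-element tensor rather than a real loss:

```python
import torch

# A one-element tensor, standing in for a loss value
loss = torch.tensor(0.25)

# .item() extracts the value as a plain Python float on the CPU
value = loss.item()
print(type(value))  # <class 'float'>
```

Note that .item() only works on tensors with exactly one element; calling it on a larger tensor raises an error.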

How is epoch loss calculated in PyTorch?

If you would like to calculate the loss for each epoch, divide the running_loss by the number of batches and append it to train_losses in each epoch. Accuracy is the number of correct classifications divided by the total number of classifications.
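A minimal sketch of that per-epoch bookkeeping, with the per-batch losses stubbed out as dummy tensors (in a real loop they would come from criterion(outputs, labels)):

```python
import torch

train_losses = []

# Stand-ins for the loss tensors produced over one epoch of 5 batches
batch_losses = [torch.tensor(l) for l in (0.9, 0.8, 0.7, 0.6, 0.5)]
num_batches = len(batch_losses)

running_loss = 0.0
for loss in batch_losses:
    # accumulate the scalar loss of each batch
    running_loss += loss.item()

# average loss per batch for this epoch
epoch_loss = running_loss / num_batches
train_losses.append(epoch_loss)
```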

How is training loss calculated?

Computationally, the training loss is calculated by taking the sum of errors for each example in the training set. It is also important to note that the training loss is measured after each batch. This is usually visualized by plotting a curve of the training loss.

What does item() do in PyTorch?

Returns the value of this tensor as a standard Python number. This only works for tensors with one element. For other cases, see tolist().


1 Answer

It's because the loss returned by CrossEntropyLoss (and most other loss functions) is averaged over the number of elements in the batch, i.e. the reduction parameter is 'mean' by default:

torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')

Hence, loss.item() contains the loss of the entire mini-batch, but divided by the batch size. That's why loss.item() is multiplied by the batch size, given by inputs.size(0), when calculating running_loss: it recovers the summed per-sample loss, so that dividing the accumulated total by dataset_sizes[phase] at the end gives the average loss per sample over the whole epoch.
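You can check this equivalence directly (a small demonstration, not from the tutorial itself): with reduction='mean', multiplying by the batch size recovers what reduction='sum' would have given.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
inputs = torch.randn(4, 3)            # batch of 4 samples, 3 classes (raw logits)
labels = torch.tensor([0, 2, 1, 0])   # dummy target classes

mean_loss = nn.CrossEntropyLoss(reduction='mean')(inputs, labels)
sum_loss = nn.CrossEntropyLoss(reduction='sum')(inputs, labels)

# mean_loss.item() * batch_size matches the summed per-sample loss,
# which is exactly what running_loss accumulates in the tutorial
assert abs(mean_loss.item() * inputs.size(0) - sum_loss.item()) < 1e-4
```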

Answered Oct 17 '22 by kHarshit