When I want to evaluate the performance of my model on the validation set, is it preferred to use with torch.no_grad(): or model.eval()?
model.eval() puts a PyTorch model into evaluation mode. It acts as a switch for particular parts of the model that behave differently at training time and at evaluation time.
with torch.no_grad() is a context manager: every tensor computed inside the block has requires_grad set to False. In other words, the results of those computations are detached from the current computational graph and no gradients are tracked for them.
model.train() tells your model that you are training it. Layers like Dropout and BatchNorm, which behave differently during training and evaluation, then know what is going on and can behave accordingly.
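A minimal sketch of this switch: calling train() or eval() on a module simply toggles its training flag, which mode-dependent layers such as Dropout consult in their forward pass.

```python
import torch.nn as nn

# A small model containing a layer whose behaviour depends on the mode.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

model.train()            # training mode: Dropout is active
print(model.training)    # True

model.eval()             # evaluation mode: Dropout becomes a no-op
print(model.training)    # False
```

Note that eval() is just shorthand for train(False); both recursively set the flag on all submodules.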
Use both. They do different things, and have different scopes.
- with torch.no_grad() disables tracking of gradients in autograd.
- model.eval() changes the forward() behaviour of the module it is called upon.
with torch.no_grad()
The torch.autograd.no_grad documentation says:

Context-manager that disabled [sic] gradient calculation.

Disabling gradient calculation is useful for inference, when you are sure that you will not call Tensor.backward(). It will reduce memory consumption for computations that would otherwise have requires_grad=True. In this mode, the result of every computation will have requires_grad=False, even when the inputs have requires_grad=True.
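The behaviour described in the quote can be seen directly: outside the context manager a computation on a gradient-tracking tensor produces a tracked result, while inside it the result is detached.

```python
import torch

x = torch.ones(3, requires_grad=True)

y = x * 2                # gradient tracking on: y is part of the graph
print(y.requires_grad)   # True

with torch.no_grad():
    z = x * 2            # inside no_grad: result is detached from the graph
print(z.requires_grad)   # False
```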
model.eval()
The nn.Module.eval documentation says:

Sets the module in evaluation mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.
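Dropout makes the difference easy to observe: in training mode it zeroes elements at random (scaling the survivors by 1/(1-p)), while in evaluation mode it is the identity function.

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(10)

drop.train()
# Training mode: roughly half the elements are zeroed,
# the rest are scaled by 1/(1-p) = 2.
print(drop(x))

drop.eval()
# Evaluation mode: the input passes through unchanged.
print(drop(x))   # tensor of ones
```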
The creator of PyTorch said the documentation should be updated to suggest the usage of both, and I raised a pull request.
with torch.no_grad():
disables computation of gradients for the backward pass. Since these calculations are unnecessary during inference, and add non-trivial computational overhead, it is essential to use this context manager if you are evaluating the model's speed. It will not, however, affect results.
model.eval()
ensures certain modules which behave differently in training vs inference (e.g. Dropout and BatchNorm) are defined appropriately during the forward pass in inference. As such, if your model contains such modules it is essential to enable this.
For the reasons above it is good practice to use both during inference.
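Putting the two together, a validation pass might look like the following sketch (the evaluate function, model, val_loader, and loss_fn names are illustrative, not from any particular codebase):

```python
import torch

def evaluate(model, val_loader, loss_fn):
    """Hypothetical validation loop using both model.eval() and torch.no_grad()."""
    model.eval()                      # switch Dropout/BatchNorm to inference behaviour
    total_loss, n_batches = 0.0, 0
    with torch.no_grad():             # skip gradient bookkeeping: faster, less memory
        for inputs, targets in val_loader:
            outputs = model(inputs)
            total_loss += loss_fn(outputs, targets).item()
            n_batches += 1
    model.train()                     # restore training mode afterwards
    return total_loss / max(n_batches, 1)
```

Remember to switch back to model.train() before resuming training, otherwise mode-dependent layers will keep their inference behaviour.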