I successfully trained the network but got this error during validation:

    RuntimeError: CUDA error: out of memory


How to fix this strange error: "RuntimeError: CUDA error: out of memory"

2 Answers

The error occurs because you ran out of memory on your GPU.

One way to solve it is to reduce the batch size until your code runs without this error.
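As a rough sketch of "reduce the batch size until it runs", the loop below halves the batch size each time an out-of-memory error is raised. The helper name `run_with_backoff` and the stand-in `fake_validation` are hypothetical and only simulate the error; in real code the callable would build a DataLoader with the given batch size and run the validation pass.

```python
def run_with_backoff(run_fn, batch_size, min_batch_size=1):
    """Call run_fn(batch_size), halving batch_size after each out-of-memory error."""
    while True:
        try:
            return run_fn(batch_size), batch_size
        except RuntimeError as err:
            # Re-raise anything that is not an OOM, or if we cannot shrink further.
            if "out of memory" not in str(err) or batch_size <= min_batch_size:
                raise
            batch_size = max(min_batch_size, batch_size // 2)


def fake_validation(batch_size):
    # Stand-in for a real validation pass: pretends batches above 16 do not fit.
    if batch_size > 16:
        raise RuntimeError("CUDA error: out of memory")
    return f"ok with batch_size={batch_size}"


result, used_batch_size = run_with_backoff(fake_validation, 64)
print(result, used_batch_size)
```

The retry happens after the `except` block has finished, so the exception object (and the frames it references) is released before the next attempt.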


answered Oct 16 '22 12:10

K. Khanda

1. When you run validation only, not training,
you don't need to compute gradients, because there is no backward pass.
In that case, put your validation code inside a torch.no_grad() block:

with torch.no_grad():  # no autograd graph is recorded inside this block
    ...
    net = Net()
    pred_for_validation = net(input)
    ...

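Inside this block PyTorch does not build the autograd graph, so the intermediate activations needed for a backward pass are not kept. The model and its outputs still occupy GPU memory, but validation needs much less of it.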
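To make the effect of torch.no_grad() visible, here is a minimal sketch that runs the same forward pass with and without it. The `nn.Linear` model and the random batch are placeholders for the trained network and the validation data, not part of the original question.

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = nn.Linear(4, 2).to(device)               # placeholder for the trained network
val_inputs = torch.randn(8, 4, device=device)  # placeholder for a validation batch

net.eval()             # put layers such as dropout and batch norm in inference mode
with torch.no_grad():  # operations in this block are not recorded for backward
    pred_no_grad = net(val_inputs)

pred_with_grad = net(val_inputs)  # same forward pass, but the graph is recorded

print(pred_no_grad.requires_grad)    # False
print(pred_with_grad.requires_grad)  # True
```

The second output still holds a reference to its graph through `grad_fn`, which is what keeps the extra memory alive.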

2. If you use the += operator to accumulate a loss tensor in your code,
each addition keeps that iteration's autograd graph alive, so memory use grows continuously.
In that case, convert the loss to a Python number with float(), as described in the PyTorch FAQ:
https://pytorch.org/docs/stable/notes/faq.html#my-model-reports-cuda-runtime-error-2-out-of-memory

The docs recommend float(), but in my case item() also worked:

entire_loss = 0.0
for i in range(100):
    one_loss = loss_function(prediction, label)
    entire_loss += one_loss.item()  # plain Python number, no graph attached
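The difference between the two ways of accumulating can be checked on the CPU with a small sketch. The model, data, and the names `graph_total` and `plain_total` are made up for illustration.

```python
import torch
from torch import nn

model = nn.Linear(3, 1)
loss_function = nn.MSELoss()
inputs, labels = torch.randn(5, 3), torch.randn(5, 1)

graph_total = 0.0  # will become a tensor that references every iteration's graph
plain_total = 0.0  # stays a Python float, so no graph is retained

for _ in range(3):
    one_loss = loss_function(model(inputs), labels)
    graph_total += one_loss         # tensor accumulation keeps the graphs alive
    plain_total += one_loss.item()  # float accumulation lets each graph be freed

print(type(graph_total).__name__, graph_total.requires_grad)  # Tensor True
print(type(plain_total).__name__)                              # float
```

Using `float(one_loss)` instead of `one_loss.item()` gives the same kind of result for a one-element loss tensor.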

3. If you use a for loop in your training code,
variables assigned inside it stay alive until they are reassigned or the whole loop ends.
So, in that case, you can explicitly delete intermediate variables after calling optimizer.step():

for one_epoch in range(100):
    ...
    optimizer.step()
    # drop references to tensors that are no longer needed (list any others too)
    del intermediate_variable1, intermediate_variable2
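A runnable version of this pattern might look like the sketch below. The tiny model, the random data, and the choice of which variables to delete are assumptions for illustration; the call to `torch.cuda.empty_cache()` is optional and only returns cached, unused blocks to the driver.

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_function = nn.MSELoss()

for one_epoch in range(3):
    inputs = torch.randn(16, 4, device=device)
    labels = torch.randn(16, 1, device=device)

    optimizer.zero_grad()
    prediction = model(inputs)
    loss = loss_function(prediction, labels)
    loss.backward()
    optimizer.step()

    # Drop the last references so these tensors can be freed right away
    # instead of surviving until they are reassigned in the next iteration.
    del inputs, labels, prediction, loss

if device.type == "cuda":
    torch.cuda.empty_cache()  # release cached blocks that are no longer in use
```

Deleting a name only frees the memory once no other reference to the tensor remains, so anything stored in a list or logged as a tensor will still be kept.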

answered Oct 16 '22 11:10

YoungMin Park


Tags: python, pycharm, pytorch

xiaoding chen
