Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

CUDA runtime error (59) : device-side assert triggered

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/generated/../generic/THCTensorMathPointwise.cu line=265 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "main.py", line 109, in <module>
    train(loader_train, model, criterion, optimizer)
  File "main.py", line 54, in train
    optimizer.step()
  File "/usr/local/anaconda35/lib/python3.6/site-packages/torch/optim/sgd.py", line 93, in step
    d_p.add_(weight_decay, p.data)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/generated/../generic/THCTensorMathPointwise.cu:265

How do I resolve this error?

like image 537
saichand Avatar asked Aug 05 '18 05:08

saichand


3 Answers

This is usually an indexing issue.

For example, if your ground truth label starts at 1:

target = [1,2,3,4,5]

Then you should subtract 1 for every label instead so that:

target = [0,1,2,3,4]
like image 118
Rainy Avatar answered Nov 12 '22 14:11

Rainy


In general, when encountering cuda runtine errors, it is advisable to run your program again using the CUDA_LAUNCH_BLOCKING=1 flag to obtain an accurate stack trace.

In your specific case, the targets of your data were too high (or low) for the specified number of classes.

like image 26
McLawrence Avatar answered Nov 12 '22 14:11

McLawrence


I encountered this error when running BertModel.from_pretrained('bert-base-uncased'). I found the solution by moving to the CPU when the error message changed to 'IndexError: index out of range in self'. Which led me to this post. The solution was to truncate sentences to length 512.

like image 12
R Tiffin Avatar answered Nov 12 '22 14:11

R Tiffin