
How does one use PyTorch (+ CUDA) with an A100 GPU?

I was trying to run my existing code on an A100 GPU, but I get this error:

---> backend='nccl'
/home/miranda9/miniconda3/envs/metalearningpy1.7.1c10.2/lib/python3.8/site-packages/torch/cuda/__init__.py:104: UserWarning: 
A100-SXM4-40GB with CUDA capability sm_80 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37.
If you want to use the A100-SXM4-40GB GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

This is rather confusing, because it points to the usual PyTorch installation page but doesn't tell me which combination of PyTorch version and CUDA version to use for my specific hardware (A100). What is the right way to install PyTorch for an A100?


These are some versions I've tried:

# conda install -y pytorch==1.8.0 torchvision cudatoolkit=10.2 -c pytorch
# conda install -y pytorch torchvision cudatoolkit=10.2 -c pytorch
#conda install -y pytorch==1.7.1 torchvision torchaudio cudatoolkit=10.2 -c pytorch -c conda-forge
# conda install -y pytorch==1.6.0 torchvision cudatoolkit=10.2 -c pytorch
#conda install -y pytorch==1.7.1 torchvision torchaudio cudatoolkit=11.1 -c pytorch -c conda-forge

# conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch
# conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c conda-forge
# conda install -y pytorch torchvision cudatoolkit=9.2 -c pytorch # For Nano, CC
# conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c conda-forge

Note that this can be subtle, because I've had this error with this machine + PyTorch version in the past:

How to solve the famous `unhandled cuda error, NCCL version 2.7.8` error?

Charlie Parker asked Apr 07 '21 19:04



2 Answers

Following the PyTorch site linked from @SimonB's answer, I ran:

pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html

This solved the problem for me.
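
One way to confirm that an install like this actually covers the A100 is to compare the GPU's compute capability with the architectures the installed build was compiled for. The sketch below is only a rough check under stated assumptions: `build_supports` is a hypothetical helper, and its rule (an `sm_XY` entry with the same major version and a minor version no greater than the device's) is a simplification, not necessarily the exact logic PyTorch uses for its warning. It relies on `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`.

```python
import re


def build_supports(capability, arch_list):
    """Simplified check (an assumption, not PyTorch's own rule): True if some
    sm_XY entry has the device's major version and a minor version <= the device's."""
    major, minor = capability
    for arch in arch_list:
        match = re.match(r"sm_(\d+)", arch)
        if match is None:
            continue  # skip PTX-only entries such as compute_37
        sm = int(match.group(1))
        if sm // 10 == major and sm % 10 <= minor:
            return True
    return False


def main():
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return
    if not torch.cuda.is_available():
        print("No usable CUDA device found")
        return
    capability = torch.cuda.get_device_capability(0)  # (8, 0) on an A100
    arch_list = torch.cuda.get_arch_list()            # e.g. ['sm_37', ..., 'sm_80']
    print("torch", torch.__version__, "built with CUDA", torch.version.cuda)
    print("device capability:", capability)
    print("compiled architectures:", arch_list)
    print("build covers this GPU:", build_supports(capability, arch_list))


if __name__ == "__main__":
    main()
```

With the architecture list from the warning in the question (`sm_37` through `sm_75`), the helper returns `False` for capability `(8, 0)`; a build whose list includes `sm_80` returns `True`.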

James Hirschorn answered Sep 22 '22 19:09



I've got an A100 and have had success with:

conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia

This is now also the command recommended on the PyTorch site.
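
The `cudatoolkit` version matters here because the A100's `sm_80` architecture was introduced with CUDA 11.0, so a build against CUDA 10.2 or 9.2 cannot include it. As a small illustration, the sketch below filters the `cudatoolkit` versions tried in the question by that threshold. The names `MIN_CUDA_FOR_SM80`, `parse_cuda_version`, and `cuda_can_target_a100` are hypothetical helpers for this example. Passing the check is necessary but not sufficient: the PyTorch binary must also have been compiled with `sm_80` enabled.

```python
# CUDA 11.0 is the first toolkit release that can target sm_80 (A100).
MIN_CUDA_FOR_SM80 = (11, 0)


def parse_cuda_version(text):
    """Turn a version string such as '11.1' into a (major, minor) tuple."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def cuda_can_target_a100(text):
    """True if this CUDA toolkit version is new enough to know about sm_80."""
    return parse_cuda_version(text) >= MIN_CUDA_FOR_SM80


if __name__ == "__main__":
    # cudatoolkit versions that appear in the question's install attempts
    for version in ["9.2", "10.2", "11.0", "11.1"]:
        verdict = "ok" if cuda_can_target_a100(version) else "too old for sm_80"
        print(f"cudatoolkit={version}: {verdict}")
```

For an installed build, the same check can be applied to the string in `torch.version.cuda`.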

Simon B answered Sep 22 '22 19:09
