I've noticed that torch.device can accept a range of arguments, precisely cpu, cuda, mkldnn, opengl, opencl, ideep, hip, msnpu.
However, when training deep learning models, I've only ever seen cuda or cpu being used. Very often the code looks something like this:
if torch.cuda.is_available():
device = torch.device("cuda")
else:
device = torch.device("cpu")
I've never seen any of the others being used and was wondering whether they can be used, and how. I believe the latest MacBooks with an AMD graphics card should be able to use "hip", but is that true? And would the training speed be similar to that of a single CUDA GPU? If not, what is the point of torch.device accepting so many options if they cannot actually be used?
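For example, constructing the device object for one of the exotic backends works, but I suspect actually allocating a tensor on it fails at runtime on a standard CPU/CUDA build (a rough sketch; the exact error type and message may vary):

import torch

dev = torch.device("opengl")        # parsing the backend string works fine
print(dev)

try:
    x = torch.zeros(3, device=dev)  # allocation needs a real backend implementation
except RuntimeError as e:
    print("allocation failed:", e)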
No CUDA: to install PyTorch via pip on a system that does not have a CUDA-capable or ROCm-capable GPU, or that does not require CUDA/ROCm (i.e. GPU support), choose OS: Linux, Package: Pip, Language: Python and Compute Platform: CPU in the install selector, then run the command that is presented to you.
PyTorch v1.12 introduces GPU-accelerated training on Apple silicon. It comes as a collaborative effort between PyTorch and the Metal engineering team at Apple, and uses Apple's Metal Performance Shaders (MPS) as the backend for PyTorch operations.
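Based on that, device selection on an Apple-silicon Mac could look roughly like this (a minimal sketch, assuming PyTorch 1.12+ built with MPS support):

import torch

# Prefer the Apple GPU via Metal (MPS) when available, otherwise fall back to the CPU.
if torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

x = torch.randn(8, 8, device=device)  # this tensor lives on the MPS device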
PyTorch provides a simple-to-use API to transfer a tensor generated on the CPU to the GPU. Conveniently, new tensors are generated on the same device as the parent tensor, and the same logic applies to the model. Thus both the data and the model need to be transferred to the GPU.
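For instance, a generic sketch of moving both the model and a batch of data to the selected device (not specific to any particular backend):

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(10, 2).to(device)     # move the model's parameters to the device
inputs = torch.randn(4, 10).to(device)  # move the input batch to the device

outputs = model(inputs)                 # the output is created on the same device
print(outputs.device)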
Single-node server requirements: before you can run an AMD machine learning framework container, your Docker environment must support AMD GPUs. Note: the AMD PyTorch framework container assumes that the server contains the required x86-64 CPU(s) and at least one of the listed AMD GPUs.
If you want to use a GPU for deep learning, in practice the selection is between CUDA and CUDA...
A broader answer: yes, there is AMD's HIP and there are some OpenCL implementations:
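One practical note on HIP (a sketch, assuming a ROCm build of PyTorch): AMD GPUs are not selected with a separate "hip" device string in practice; a ROCm build exposes them through the regular "cuda" device, so CUDA-style code runs unchanged:

import torch

print(torch.version.hip)  # ROCm/HIP version string on a ROCm build, None otherwise

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.ones(3, device=device)  # lands on the AMD GPU under ROCm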
Bottom line:
Edit 2021-09-14: there is a new project, dlprimitives:
https://github.com/artyom-beilis/dlprimitives
It has better performance than both Caffe-OpenCL and Keras, reaching roughly 75% of Keras/TF2 performance for training. However, it is under early development and at this point has a much more limited set of layers than Caffe/Keras-PlaidML.
The connection to PyTorch is a work in progress, with some initial results: https://github.com/artyom-beilis/pytorch_dlprim
Disclaimer: I'm the author of this project