We can allocate a tensor on the GPU using torch.Tensor([1., 2.], device='cuda'). Are there any differences between that approach and torch.cuda.Tensor([1., 2.]), except that the former lets us pass a specific CUDA device? In other words, in which scenario is torch.cuda.Tensor() necessary?
torch.tensor infers the dtype automatically from the data, while torch.Tensor always returns a torch.FloatTensor.
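A quick sketch of that difference (assuming a reasonably recent PyTorch version; expected output is noted in the comments):
import torch
# torch.tensor picks the dtype from the data it is given
print(torch.tensor([1, 2]).dtype)    # torch.int64
print(torch.tensor([1., 2.]).dtype)  # torch.float32
# torch.Tensor is the legacy constructor and always yields float32
print(torch.Tensor([1, 2]).dtype)    # torch.float32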
torch.cuda is used to set up and run CUDA operations. It keeps track of the currently selected GPU, and all CUDA tensors you allocate will by default be created on that device. The selected device can be changed with a torch.cuda.device context manager.
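For illustration, a minimal sketch of switching the selected device with the torch.cuda.device context manager (the second part assumes a machine with at least two GPUs):
import torch
if torch.cuda.is_available():
    # allocated on the currently selected device, cuda:0 by default
    a = torch.cuda.FloatTensor([1., 2.])
    print(a.device)  # cuda:0
    # temporarily select another GPU; tensors allocated inside go there
    with torch.cuda.device(1):
        b = torch.cuda.FloatTensor([1., 2.])
        print(b.device)  # cuda:1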
cuda() and to('cuda') do the same thing, but the latter is more flexible. As you can see in your example code, you can specify a device that falls back to 'cpu' if CUDA is unavailable.
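A minimal sketch of that flexibility (runs on both CPU-only and CUDA machines):
import torch
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x = torch.rand(3)
y = x.to(device)      # works everywhere, falls back to CPU
print(y.device)
if torch.cuda.is_available():
    z = x.cuda()      # only works when a CUDA device is present
    print(z.device)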
So generally both torch.Tensor and torch.cuda.Tensor are equivalent: you can do everything you like with either of them.
The key difference is just that torch.Tensor occupies CPU memory while torch.cuda.Tensor occupies GPU memory. Of course, operations on a CPU tensor are computed on the CPU, while operations on a GPU / CUDA tensor are computed on the GPU.
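A short sketch of where each type lives (the GPU part assumes a CUDA-capable machine):
import torch
cpu_t = torch.FloatTensor([1., 2.])           # allocated in CPU memory
print(cpu_t.device)                           # cpu
if torch.cuda.is_available():
    gpu_t = torch.cuda.FloatTensor([1., 2.])  # allocated in GPU memory
    print(gpu_t.device)                       # cuda:0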
The reason you need these two tensor types is that the underlying hardware interfaces are completely different. Apart from the fact that it makes no sense computationally, you will get an error as soon as you try to do computations between a torch.Tensor and a torch.cuda.Tensor:
import torch
# device will be 'cuda' if a GPU is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# creating a CPU tensor
cpu_tensor = torch.rand(10)
# moving the same tensor to the GPU (or keeping it on CPU if no GPU is available)
gpu_tensor = cpu_tensor.to(device)
print(cpu_tensor, cpu_tensor.dtype, type(cpu_tensor), cpu_tensor.type())
print(gpu_tensor, gpu_tensor.dtype, type(gpu_tensor), gpu_tensor.type())
print(cpu_tensor*gpu_tensor)
Output:
tensor([0.8571, 0.9171, 0.6626, 0.8086, 0.6440, 0.3682, 0.9920, 0.4298, 0.0172,
0.1619]) torch.float32 <class 'torch.Tensor'> torch.FloatTensor
tensor([0.8571, 0.9171, 0.6626, 0.8086, 0.6440, 0.3682, 0.9920, 0.4298, 0.0172,
0.1619], device='cuda:0') torch.float32 <class 'torch.Tensor'> torch.cuda.FloatTensor
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-15-ac794171c178> in <module>()
12 print(gpu_tensor, gpu_tensor.dtype, type(gpu_tensor), gpu_tensor.type())
13
---> 14 print(cpu_tensor*gpu_tensor)
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #2 'other'
As the underlying hardware interfaces are completely different, CPU tensors are only compatible with CPU tensors and, vice versa, GPU tensors are only compatible with GPU tensors.
Edit:
As you can see here, a tensor that is moved to the GPU is actually a tensor of type torch.cuda.*Tensor, i.e. torch.cuda.FloatTensor.
So cpu_tensor.to(device) or torch.Tensor([1., 2.], device='cuda') will actually return a tensor of type torch.cuda.FloatTensor.
In which scenario is torch.cuda.Tensor() necessary?
When you want to use GPU acceleration (which is much faster in most cases) for your program, you need to use torch.cuda.Tensor, but you have to make sure that ALL the tensors you are using are CUDA tensors; mixing is not possible here.
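A small sketch of the usual fix for mixed tensors: move every operand to the same device before combining them.
import torch
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
a = torch.rand(10)
b = torch.rand(10)
# move both operands to the same device, then the operation succeeds
a = a.to(device)
b = b.to(device)
print(a * b)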