
How to understand creating leaf tensors in PyTorch?

From the PyTorch documentation for is_leaf:

b = torch.rand(10, requires_grad=True).cuda()
# b was created by the operation that cast a cpu Tensor into a cuda Tensor

e = torch.rand(10).cuda().requires_grad_()
# e requires gradients and has no operations creating it

f = torch.rand(10, requires_grad=True, device="cuda")
# f requires grad, has no operation creating it

But why are e and f leaf tensors, when both were also cast from a CPU tensor into a CUDA tensor (an operation)?

Is it because tensor e was cast to CUDA before the in-place operation requires_grad_()?

And because f was placed on the GPU through the argument device="cuda" rather than through the method .cuda()?

Asked by Omar AlSuwaidi, Dec 15 '20 07:12

1 Answer

When a tensor is first created, it becomes a leaf node.

Basically, all inputs and weights of a neural network are leaf nodes of the computational graph.

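The result of an operation on a tensor that requires gradients is not a leaf node anymore: it has a grad_fn that records the operation which produced it. A tensor that does not require gradients is always reported as a leaf, whatever produced it.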

b = torch.rand(10, requires_grad=True) # create a leaf node
b.is_leaf # True
b = b.cuda() # perform a casting operation
b.is_leaf # False
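
To check the rule without a GPU, here is a minimal sketch. It assumes that a dtype cast with .double() is a fair stand-in for .cuda(), since both are ordinary operations that autograd tracks when the input requires gradients; the names a, b and c are only for illustration.

```python
import torch

# Created directly by the user: no grad_fn, so it is a leaf.
a = torch.rand(10, requires_grad=True)
print(a.is_leaf, a.grad_fn)

# Result of a tracked operation on `a`: it has a grad_fn, so it is not a leaf.
# .double() stands in for .cuda() so this runs on a CPU-only machine (assumption).
b = a.double()
print(b.is_leaf, b.grad_fn is not None)

# Same operation on a tensor that does not require grad: nothing is tracked,
# so the result is still a leaf by convention.
c = torch.rand(10).double()
print(c.is_leaf, c.requires_grad)
```

The first and third tensors report is_leaf as True, and only b, which was produced from a tensor that requires gradients, reports False.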

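requires_grad_() is not an operation in the same way as cuda() or others are: it is not recorded in the computational graph.
It sets the requires_grad flag in place and returns the same tensor. This works here because e does not require gradients when it is cast, so the cast is not tracked and e is still a leaf with nothing creating it.

e = torch.rand(10) # create a leaf node
e.is_leaf # True
e = e.cuda() # cast is not tracked, because e does not require grad
e.is_leaf # True (tensors that do not require grad are leaves by convention)
e = e.requires_grad_() # sets the flag in place, returns the same tensor
e.is_leaf # True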
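
A quick way to see what requires_grad_() does is to compare the returned object with the original. This is a small sketch that again uses .double() in place of .cuda() so it runs without a GPU (an assumption of the example, not part of the documentation snippet); the name same is only for illustration.

```python
import torch

# Cast a tensor that does not require grad; the cast is not tracked.
e = torch.rand(10).double()
print(e.is_leaf, e.requires_grad)

# requires_grad_() flips the flag in place and returns the tensor it was called on.
same = e.requires_grad_()
print(same is e)

# Still a leaf, now requiring grad, with no operation recorded as its creator.
print(e.is_leaf, e.requires_grad, e.grad_fn)
```

Since no grad_fn is ever attached, e ends up as a leaf that requires gradients, which matches the comment in the documentation.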

Also, detach() creates a new tensor that does not require gradients, so the result is a leaf as well:

b = torch.rand(10, requires_grad=True)
b.is_leaf # True
b = b.detach()
b.is_leaf # True

In the last example from the question, the tensor is created directly on the CUDA device.
No cast operation is needed, so nothing is recorded in the graph and the tensor is a leaf.

f = torch.rand(10, requires_grad=True, device="cuda") # create a leaf node on cuda
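
Why this matters in practice: during backward(), only leaf tensors get their .grad populated by default. A minimal sketch, assuming a fallback to the CPU when no GPU is present (the device variable and the tensor g are illustrative additions):

```python
import torch

# Fall back to the CPU when CUDA is unavailable (assumption for portability).
device = "cuda" if torch.cuda.is_available() else "cpu"

# Created directly on the target device: a leaf that requires grad.
f = torch.rand(10, requires_grad=True, device=device)

# Result of an operation on f: not a leaf.
g = f * 2
g.sum().backward()

print(f.is_leaf, f.grad is not None)  # gradient is stored on the leaf
print(g.is_leaf)
```

If f had been built as torch.rand(10, requires_grad=True).cuda() instead, the variable would hold a non-leaf and its .grad would stay empty, which is usually not what you want for a trainable weight.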
Answered by hocop, Oct 13 '22 10:10
