
PyTorch Forward Pass with CUDA Tensor and CPU Tensor while Retaining Gradients

I have a vector a produced by a neural model that needs to interact with a huge matrix M. Since M is large, I have to do the computation on the CPU. In this case, I wonder whether the gradient can be retained and backpropagated on the CUDA device.

Example below:

import torch

a_cuda = torch.randn([1, 512], requires_grad=True).to("cuda")
a_cpu = torch.randn([1, 512], requires_grad=True).to("cpu")

M = torch.randn([512, 100000], requires_grad=False)  # loaded on the CPU, doesn't need updates

out_cuda = (a_cuda.cpu() @ M).sum()
out_cuda.backward()

out_cpu = (a_cpu @ M).sum()
out_cpu.backward()

print(a_cuda.grad)  # None
print(a_cpu.grad)

I am looking for a solution such that a_cuda.grad holds the same gradients as a_cpu.grad.

Asked by Garvey on Nov 27 '25


1 Answer

Sure. The catch is that .to("cuda") is an autograd operation: torch.randn(..., requires_grad=True).to("cuda") returns a non-leaf tensor, and by default autograd only populates .grad on leaf tensors. Create the tensor directly on the device instead, so that it is itself a leaf.

Try this:

import torch

a_cuda = torch.randn([1, 512], requires_grad=True, device='cuda')  # leaf tensor, created on the GPU
a_cpu = torch.randn([1, 512], requires_grad=True)

M = torch.randn([512, 100000], requires_grad=False)  # loaded on the CPU, doesn't need updates

out_cuda = (a_cuda.cpu() @ M).sum()  # autograd tracks the cpu() transfer
out_cuda.backward()

out_cpu = (a_cpu @ M).sum()
out_cpu.backward()

print(a_cuda.grad)  # now populated, on cuda:0
print(a_cpu.grad)
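
If you cannot control how the tensor is created (e.g. it comes out of a model), Tensor.retain_grad() is an alternative: it tells autograd to keep the gradient on a non-leaf tensor. A minimal sketch, assuming a CUDA device is available; the 1000-column M here is just a small stand-in for the huge matrix:

import torch

# Non-leaf: .to("cuda") is itself an autograd op, so this tensor is derived.
a_moved = torch.randn([1, 512], requires_grad=True).to("cuda")
print(a_moved.is_leaf)  # False -> .grad would stay None after backward()

# Ask autograd to keep the gradient on this non-leaf tensor anyway.
a_moved.retain_grad()

M = torch.randn([512, 1000])          # small stand-in for the huge CPU matrix
(a_moved.cpu() @ M).sum().backward()  # gradient flows back across devices

print(a_moved.grad.shape)   # torch.Size([1, 512])
print(a_moved.grad.device)  # cuda:0 -- the gradient is retained on the GPU

Either way, autograd handles the CPU/GPU transfer inside the graph, so the matmul can run on the CPU while the gradient lands on the CUDA tensor.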
Answered by Yakov Dan on Nov 30 '25


