See the code snippet:
import torch
x = torch.tensor([-1.], requires_grad=True)
y = torch.where(x > 0., x, torch.tensor([2.], requires_grad=True))
y.backward()
print(x.grad)
The output is tensor([0.]), but
import torch
x = torch.tensor([-1.], requires_grad=True)
if x > 0.:
    y = x
else:
    y = torch.tensor([2.], requires_grad=True)
y.backward()
print(x.grad)
The output is None.
I'm confused: why is the output of the torch.where version tensor([0.])?
Another example:
import torch
a = torch.tensor([[1,2.], [3., 4]])
b = torch.tensor([-1., -1], requires_grad=True)
a[:,0] = b
(a[0, 0] * a[0, 1]).backward()
print(b.grad)
The output is tensor([2., 0.]). (a[0, 0] * a[0, 1]) is not in any way related to b[1], but the gradient of b[1] is 0, not None.
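A reduced sketch of what I think is happening (my own reduction; the variable names are just for illustration):
import torch
# .grad is accumulated per leaf tensor and has the same shape as that leaf,
# so elements that do not influence the output receive 0 rather than None.
b = torch.tensor([-1., -1.], requires_grad=True)
loss = b[0] * 2.   # only b[0] is used
loss.backward()
print(b.grad)      # tensor([2., 0.])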
In autograd, the forward function computes output Tensors from input Tensors, and the backward function receives the gradient of the output Tensors with respect to some scalar value and computes the gradient of the input Tensors with respect to that same scalar value. All differentiable torch.* functions support this; functions such as indexing (and a few others) are not differentiable with respect to their indices or certain of their inputs, and trying to differentiate through those raises an appropriate error.
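A minimal sketch of that forward/backward contract (my own illustration; the Square function below is hypothetical, not from the post):
import torch

class Square(torch.autograd.Function):
    # forward: compute the output tensor from the input tensor
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * x

    # backward: receive dL/d(output) and return dL/d(input)
    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * 2 * x

x = torch.tensor([3.], requires_grad=True)
Square.apply(x).backward()
print(x.grad)  # tensor([6.])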
Tracking-based AD, like PyTorch, works by tracking: you can't track through things that are not function calls intercepted by the library. By using an if statement like this, there is no connection between x and y, whereas with where, x and y are linked in the expression tree.
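A small sketch of that linkage (my own illustration; the constant branch is named z here only so its gradient can be inspected):
import torch
x = torch.tensor([-1.], requires_grad=True)
z = torch.tensor([2.], requires_grad=True)
y = torch.where(x > 0., x, z)
y.backward()
print(x.grad)  # tensor([0.]) - x is in the graph, but its branch was not selected
print(z.grad)  # tensor([1.]) - the selected branch receives the gradient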
Now, for the differences:
- 0 is the correct derivative of the function x ↦ (x > 0 ? x : 2) at the point -1 (since the negative branch is constant).
- x is not in any way related to y (in the else branch). Therefore, the derivative of y with respect to x is undefined, which is represented as None.
(You can differentiate through such control flow even in Python, but that requires more sophisticated technology like source transformation. I don't think it is possible with PyTorch.)
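To confirm the first point, the same where expression evaluated on the positive side gives a derivative of 1 (again just a sketch):
import torch
x = torch.tensor([3.], requires_grad=True)
y = torch.where(x > 0., x, torch.tensor([2.]))
y.backward()
print(x.grad)  # tensor([1.]) - the selected branch is the identity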