
In-place operations with PyTorch

I was wondering how to deal with in-place operations in PyTorch. As far as I remember, using in-place operations with autograd has always been problematic.

And I'm actually surprised that the code below works; even though I haven't tested it, I believe this code would have raised an error in version 0.3.1.

Basically, what I want to do is set a certain position of a tensor vector to a certain value, like this:

my_tensor[i] = 42

Working example code:

import torch

# test parameter a
a = torch.rand((2), requires_grad=True)
print('a ', a)
b = torch.rand(2)

# calculation
c = a + b

# performing in-place operation
c[0] = 0
print('c ', c)
s = torch.sum(c)
print('s ', s)

# calling backward()
s.backward()

# optimizer step
optim = torch.optim.Adam(params=[a], lr=0.5)
optim.step()

# changed parameter a
print('changed a', a)

Output:

a  tensor([0.2441, 0.2589], requires_grad=True)
c  tensor([0.0000, 1.1511], grad_fn=<CopySlices>)
s  tensor(1.1511, grad_fn=<SumBackward0>)
changed a tensor([ 0.2441, -0.2411], requires_grad=True)

So obviously, in version 0.4.1 this works just fine, without warnings or errors.

Referring to this article in the documentation: autograd-mechanics

Supporting in-place operations in autograd is a hard matter, and we discourage their use in most cases. Autograd’s aggressive buffer freeing and reuse makes it very efficient and there are very few occasions when in-place operations actually lower memory usage by any significant amount. Unless you’re operating under heavy memory pressure, you might never need to use them.

But even though it works, the use of in-place operations is discouraged in most cases.


So my questions are:

  • How much does the usage of in-place operations affect performance?

  • How do I get around using in-place operations in such cases where I want to set one element of a tensor to a certain value?

Thanks in advance!

asked Aug 13 '18 by MBT


People also ask

How does Pytorch do Autograd?

How autograd encodes the history: autograd is a reverse automatic differentiation system. Conceptually, autograd records a graph of all the operations that created the data as you execute them, giving you a directed acyclic graph whose leaves are the input tensors and whose roots are the output tensors.
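
As a small illustration of that graph (my own sketch, not part of the quoted answer), the leaves and roots can be inspected directly via is_leaf and grad_fn:

import torch

x = torch.tensor([1., 2.], requires_grad=True)   # leaf: created by the user
y = (3 * x).sum()                                 # root: result of operations

print(x.is_leaf, x.grad_fn)   # True None
print(y.is_leaf, y.grad_fn)   # False <SumBackward0 object at 0x...>

y.backward()                  # traverse the graph from the root back to the leaves
print(x.grad)                 # tensor([3., 3.])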

What is Pytorch Requires_grad?

To create a tensor with gradients, we pass the extra parameter requires_grad=True when creating the tensor. requires_grad is a flag that controls whether a tensor requires a gradient or not. Only floating point and complex dtype tensors can require gradients.
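
A minimal check of that last point (my own example, not part of the quoted snippet):

import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)   # works: floating point dtype
print(x.requires_grad)                              # True

# torch.tensor([1, 2], requires_grad=True)          # integer dtype: raises a RuntimeError,
#                                                   # since only floating point (and complex)
#                                                   # tensors can require gradients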

How does Pytorch backprop work?

Backward Propagation: In backprop, the NN adjusts its parameters proportionate to the error in its guess. It does this by traversing backwards from the output, collecting the derivatives of the error with respect to the parameters of the functions (gradients), and optimizing the parameters using gradient descent.
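
A bare-bones version of that loop, with a single parameter and a manual gradient-descent step instead of an optimizer (my own sketch):

import torch

w = torch.tensor(1.0, requires_grad=True)   # the parameter
loss = (2 * w - 3) ** 2                     # forward pass: error of the current guess
loss.backward()                             # backward pass: d(loss)/dw is stored in w.grad
with torch.no_grad():
    w -= 0.1 * w.grad                       # one gradient-descent step
print(w)                                    # tensor(1.4000, requires_grad=True), moving towards 1.5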

What is Grad_fn Pytorch?

Each tensor has a grad_fn attribute that references the function that created the tensor (except for tensors created by the user; these have None as their grad_fn).


2 Answers

I am not sure how much in-place operations affect performance, but I can address the second question: you can use a mask instead of in-place ops.

import numpy as np
import torch

a = torch.rand((2), requires_grad=True)
print('a ', a)
b = torch.rand(2)

# calculation
c = a + b

# masking instead of the in-place operation
mask = np.zeros(2)
mask[1] = 1
mask = torch.tensor(mask, dtype=torch.float32)  # match c's dtype
c = c * mask
...
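
For completeness, the same masking can be done without going through NumPy, by building the mask directly in PyTorch (a sketch of my own, not part of the original answer):

import torch

a = torch.rand(2, requires_grad=True)
b = torch.rand(2)
c = a + b

mask = torch.tensor([0., 1.])   # zeroes out element 0, keeps element 1
c = c * mask

s = torch.sum(c)
s.backward()
print(a.grad)                   # tensor([0., 1.]): no gradient flows to a[0]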
answered Oct 10 '22 by Umang Gupta


This may not be a direct answer to your question; it's just for information.

In-place operations work for non-leaf tensors in a computational graph.

Leaf tensors are tensors that are the 'ends' of a computational graph. Officially (from the is_leaf attribute documentation here):

For Tensors that have requires_grad which is True, they will be leaf Tensors if they were created by the user. This means that they are not the result of an operation and so grad_fn is None.

Example which works without error:

import torch

a = torch.tensor([3.,2.,7.], requires_grad=True)
print(a)   # tensor([3., 2., 7.], requires_grad=True)
b = a**2
print(b)   # tensor([ 9.,  4., 49.], grad_fn=<PowBackward0>)
b[1] = 0
print(b)   # tensor([ 9.,  0., 49.], grad_fn=<CopySlices>)
c = torch.sum(2*b)
print(c)   # tensor(116., grad_fn=<SumBackward0>)
c.backward()
print(a.grad)  # tensor([12.,  0., 28.])

On the other hand, in-place operations do not work for leaf tensors.

Example which causes error:

a = torch.tensor([3.,2.,7.], requires_grad=True)
print(a) # tensor([3., 2., 7.], requires_grad=True)
a[1] = 0
print(a) # tensor([3., 0., 7.], grad_fn=<CopySlices>)
b = a**2
print(b) # tensor([ 9.,  0., 49.], grad_fn=<PowBackward0>)
c = torch.sum(2*b)
print(c) # tensor(116., grad_fn=<SumBackward0>)
c.backward()  # Error occurs at this line. 

# RuntimeError: leaf variable has been moved into the graph interior
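
(A side note of my own, not part of this answer: if you really need to overwrite an element of a leaf tensor and do not need gradients for that write, wrapping the assignment in torch.no_grad() avoids the error, because the operation is then not recorded in the graph.)

import torch

a = torch.tensor([3., 2., 7.], requires_grad=True)
with torch.no_grad():
    a[1] = 0                # in-place write on the leaf, not recorded in the graph
print(a)                    # tensor([3., 0., 7.], requires_grad=True)
b = a**2
c = torch.sum(2*b)
c.backward()
print(a.grad)               # tensor([12.,  0., 28.])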

I suppose that the b[1] = 0 operation in the first example above is not really an in-place operation. I suppose it creates a new tensor via the CopySlices operation, and the 'old b' from before the assignment is kept internally (only the name b is rebound to the 'new b'). I found a nice figure here:

old b ---(CopySlices)----> new b

On the other hand, the tensor a is a leaf tensor. After the CopySlices operation a[1] = 0, it would become an intermediate tensor. To avoid such a complicated mixture of leaf tensors and intermediate tensors during backpropagation, a CopySlices operation on a leaf tensor is prohibited from coexisting with backward.

This is merely my personal interpretation, so please refer to the official documentation.

Note:

Although in-place operations work for intermediate tensors, it is safest to use clone and detach as much as possible when you do in-place operations, to explicitly create a new tensor that is independent of the computational graph.
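
For example (my own sketch of what I understand the note to suggest), cloning first keeps the leaf tensor untouched while gradients still flow back to it:

import torch

a = torch.tensor([3., 2., 7.], requires_grad=True)
b = a.clone()               # b is an intermediate tensor; a stays a clean leaf
b[1] = 0                    # in-place write on the clone is fine
c = torch.sum(2 * b**2)
c.backward()
print(a.grad)               # tensor([12.,  0., 28.])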

answered Oct 10 '22 by Toru Kikuchi