
Different `grad_fn` for similar looking operations in Pytorch (1.0)

I am working on an attention model, and before running the final model I was going through the tensor shapes that flow through the code. I have an operation where I need to reshape a tensor. The tensor has shape `torch.Size([30, 8, 9, 64])`, where 30 is the batch size, 8 is the number of attention heads (this is not relevant to my question), 9 is the number of words in the sentence, and 64 is some intermediate embedding representation of the word. I have to reshape the tensor to a size of `torch.Size([30, 9, 512])` before processing it further. Looking at some references online, they do `x.transpose(1, 2).contiguous().view(30, -1, 512)`, whereas I was thinking that `x.transpose(1, 2).reshape(30, -1, 512)` should also work.

In the first case the `grad_fn` is `<ViewBackward>`, whereas in my case it is `<UnsafeViewBackward>`. Aren't these two the same operations? Will this result in a training error?
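For reference, a minimal sketch of the two variants side by side (the exact `grad_fn` names can vary a little between PyTorch releases, e.g. `ViewBackward0` / `UnsafeViewBackward0`):

```python
import torch

x = torch.randn(30, 8, 9, 64, requires_grad=True)  # [batch, heads, words, dim]

a = x.transpose(1, 2).contiguous().view(30, -1, 512)  # reference version
b = x.transpose(1, 2).reshape(30, -1, 512)            # my version

print(a.shape, b.shape)  # both torch.Size([30, 9, 512])
print(a.grad_fn)         # <ViewBackward ...>
print(b.grad_fn)         # <UnsafeViewBackward ...>
```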

asked Apr 24 '19 by abkds

People also ask

What is Grad_fn in PyTorch?

Every tensor has a `grad_fn` attribute that references the function that created it (tensors created directly by the user have `None` as their `grad_fn`).
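For example (a minimal sketch, assuming a recent PyTorch version):

```python
import torch

a = torch.ones(3, requires_grad=True)  # created by the user: no grad_fn
b = a * 2                              # created by an operation: has a grad_fn

print(a.grad_fn)  # None
print(b.grad_fn)  # <MulBackward0 ...>
```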

What is Autograd in PyTorch how is it useful?

Autograd is the PyTorch package for automatic differentiation of all operations on tensors. It performs backpropagation starting from a variable; in deep learning, this variable typically holds the value of the cost function. Calling `backward` executes the backward pass and computes all the gradients automatically.
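For example (a minimal sketch of a backward pass):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
loss = (x ** 2).sum()   # stand-in for a cost function

loss.backward()         # autograd computes d(loss)/dx during the backward pass
print(x.grad)           # tensor([2., 4., 6.])
```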

How do you differentiate in PyTorch?

PyTorch computes derivatives by building a backward graph behind the scenes; tensors and backward functions are the graph's nodes. During the backward pass, the gradient of a tensor is accumulated into its `.grad` attribute only if it is a leaf tensor.
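For example (a minimal sketch contrasting leaf and non-leaf tensors):

```python
import torch

x = torch.randn(2, 2, requires_grad=True)  # leaf tensor, created by the user
y = x * 3                                  # non-leaf tensor, created by an operation

y.sum().backward()
print(x.is_leaf, y.is_leaf)  # True False
print(x.grad)                # .grad is populated only for leaf tensors
```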



1 Answer

Aren't these two the same operations?

No. While they produce effectively the same tensor, the operations are not the same, and they are not guaranteed to have the same storage.

From the PyTorch source, `TensorShape.cpp`:

// _unsafe_view() differs from view() in that the returned tensor isn't treated
// as a view for the purposes of automatic differentiation. (It's not listed in
// VIEW_FUNCTIONS in gen_autograd.py).  It's only safe to use if the `self` tensor
// is temporary. For example, the viewed tensor here (a + b) is discarded immediately
// after viewing:
//
//  res = at::_unsafe_view(a + b, size);
//
// This is a hack because in-place operations on tensors treated like views
// can be much more expensive than the same operations on non-view tensors.

Note that this can produce an error if applied to complex inputs, but complex tensors are generally not yet fully supported in PyTorch, so this is not unique to this function.
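As a quick sanity check on the training-error concern (a minimal sketch, assuming the shapes from the question): both formulations produce the same values and the same gradients with respect to the input, so backpropagation behaves identically:

```python
import torch

x = torch.randn(30, 8, 9, 64, requires_grad=True)

a = x.transpose(1, 2).contiguous().view(30, -1, 512)
b = x.transpose(1, 2).reshape(30, -1, 512)

print(torch.equal(a, b))  # True - same values

ga, = torch.autograd.grad(a.sum(), x)
gb, = torch.autograd.grad(b.sum(), x)
print(torch.equal(ga, gb))  # True - same gradients flow back to x
```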

answered Oct 30 '22 by iacob