
PyTorch - What does contiguous() do?


There are a few operations on Tensors in PyTorch that do not change the contents of a tensor, but change the way the data is organized. These operations include:

narrow(), view(), expand() and transpose()

For example: when you call transpose(), PyTorch doesn't generate a new tensor with a new layout, it just modifies meta information in the Tensor object so that the offset and stride describe the desired new shape. In this example, the transposed tensor and original tensor share the same memory:

import torch

x = torch.randn(3,2)
y = torch.transpose(x, 0, 1)
x[0, 0] = 42
print(y[0,0])
# prints tensor(42.)
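To make the "only metadata changes" point concrete, here is a small sketch that compares the data pointer, shape, and strides of a tensor and its transpose. The variable name same_buffer is only illustrative.

```python
import torch

x = torch.randn(3, 2)
y = torch.transpose(x, 0, 1)

# Both tensors start at the same address in the same buffer.
same_buffer = x.data_ptr() == y.data_ptr()

# Only the metadata (shape and strides) differs.
print(x.shape, x.stride())  # torch.Size([3, 2]) (2, 1)
print(y.shape, y.stride())  # torch.Size([2, 3]) (1, 2)
print(same_buffer)          # True
```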

This is where the concept of contiguous comes in. In the example above, x is contiguous but y is not because its memory layout is different to that of a tensor of same shape made from scratch. Note that the word "contiguous" is a bit misleading because it's not that the content of the tensor is spread out around disconnected blocks of memory. Here bytes are still allocated in one block of memory but the order of the elements is different!

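When you call contiguous() on a non-contiguous tensor, it makes a copy of the tensor such that the order of its elements in memory is the same as if it had been created from scratch with the same data. If the tensor is already contiguous, contiguous() returns the tensor itself and no copy is made.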
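A minimal sketch of that behaviour, assuming a plain CPU tensor: the copy appears only when the input is non-contiguous, which can be checked by comparing data pointers.

```python
import torch

x = torch.randn(3, 2)
y = x.transpose(0, 1)      # non-contiguous view of x

z = y.contiguous()         # new buffer, elements reordered
w = x.contiguous()         # x is already contiguous, so nothing is copied

print(z.data_ptr() == y.data_ptr())  # False
print(w.data_ptr() == x.data_ptr())  # True
print(z.stride())                    # (3, 1)
print(torch.equal(z, y))             # True: same values, different layout
```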

Normally you don't need to worry about this. You're generally safe to assume everything will work, and wait until you get a RuntimeError: input is not contiguous where PyTorch expects a contiguous tensor to add a call to contiguous().


From the pytorch documentation:

contiguous() → Tensor
Returns a contiguous tensor containing the same data as self tensor. If self tensor is contiguous, this function returns the self tensor.

Here, contiguous means not only contiguous in memory, but also stored in the same order as the indices. For example, a transposition doesn't change the data in memory; it only changes the map from indices to memory locations. If you then apply contiguous(), it rearranges the data (in a new copy) so that the map from indices to memory locations is the canonical one.
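The map from indices to memory locations can be written out by hand: the position of an element is the storage offset plus the sum of index times stride over all dimensions. The helper memory_offset below is a hypothetical function written for this illustration, not part of PyTorch.

```python
import torch

base = torch.arange(6.0)      # flat memory: 0., 1., 2., 3., 4., 5.
x = base.view(2, 3)
y = x.t()                     # shape (3, 2), strides (1, 3), same memory
z = y.contiguous()            # shape (3, 2), strides (2, 1), new memory

def memory_offset(t, index):
    # Hypothetical helper: position of t[index] in t's underlying flat buffer.
    return t.storage_offset() + sum(i * s for i, s in zip(index, t.stride()))

off_y = memory_offset(y, (1, 0))   # 1*1 + 0*3 = 1
off_z = memory_offset(z, (1, 0))   # 1*2 + 0*1 = 2

# The same logical element lives at a different position after contiguous().
print(y[1, 0].item(), base[off_y].item(), z.view(-1)[off_z].item())
```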


tensor.contiguous() will create a copy of the tensor (unless it is already contiguous), and the elements in the copy will be stored in memory in a contiguous way. The contiguous() function is usually required when we first transpose() a tensor and then reshape (view) it. First, let's create a contiguous tensor:

import torch

aaa = torch.Tensor( [[1,2,3],[4,5,6]] )
print(aaa.stride())
print(aaa.is_contiguous())
#(3, 1)
#True

The stride() result (3, 1) means that moving one step along the first dimension (row by row) takes 3 steps in memory, while moving one step along the second dimension (column by column) takes 1 step in memory. This indicates that the elements in the tensor are stored contiguously.
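One way to see what the strides mean is to build tensors directly from a flat buffer with torch.as_strided. This is only a sketch for illustration (as_strided is rarely needed in everyday code); the names flat, rows, and cols are made up here.

```python
import torch

flat = torch.tensor([1., 2., 3., 4., 5., 6.])

# Strides (3, 1): step 3 elements per row, 1 element per column.
rows = torch.as_strided(flat, size=(2, 3), stride=(3, 1))

# Strides (1, 3) over the same buffer give the transposed layout.
cols = torch.as_strided(flat, size=(3, 2), stride=(1, 3))

print(rows)                  # [[1, 2, 3], [4, 5, 6]]
print(cols)                  # [[1, 4], [2, 5], [3, 6]]
print(rows.is_contiguous())  # True
print(cols.is_contiguous())  # False
```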

Now let's apply some functions to the tensor:

bbb = aaa.transpose(0,1)
print(bbb.stride())
print(bbb.is_contiguous())

#(1, 3)
#False


ccc = aaa.narrow(1,1,2)   ## equivalent to matrix slicing aaa[:,1:3]
print(ccc.stride())
print(ccc.is_contiguous())

#(3, 1)
#False


ddd = aaa.repeat(2,1)   # The first dimension is repeated twice, the second dimension once
print(ddd.stride())
print(ddd.is_contiguous())

#(3, 1)
#True


## expand is different from repeat.
## if a tensor has a shape [d1,d2,1], it can only be expanded using "expand(d1,d2,d3)", which
## means the singleton dimension is repeated d3 times
eee = aaa.unsqueeze(2).expand(2,3,3)
print(eee.stride())
print(eee.is_contiguous())

#(3, 1, 0)
#False


fff = aaa.unsqueeze(2).repeat(1,1,8).view(2,-1,2)
print(fff.stride())
print(fff.is_contiguous())

#(24, 2, 1)
#True
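The difference between expand() and repeat() can also be seen in memory: expand() returns a view with stride 0, while repeat() allocates new data. A small sketch (the variable names are illustrative only):

```python
import torch

aaa = torch.tensor([[1., 2., 3.], [4., 5., 6.]])

expanded = aaa.unsqueeze(2).expand(2, 3, 3)   # view with stride 0 on the last dim
repeated = aaa.unsqueeze(2).repeat(1, 1, 3)   # real copy of the data

shares_expand = expanded.data_ptr() == aaa.data_ptr()
shares_repeat = repeated.data_ptr() == aaa.data_ptr()
print(shares_expand, shares_repeat)

aaa[0, 0] = 100.                  # write through the base tensor
print(expanded[0, 0])             # all three entries now read 100
print(repeated[0, 0])             # still the original 1s
```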

OK, we can see that transpose(), narrow() (and tensor slicing), and expand() can make the resulting tensor non-contiguous. Interestingly, repeat() and view() do not make it discontiguous. So now the question is: what happens if I use a discontiguous tensor?

The answer is that the view() function generally cannot be applied to a discontiguous tensor. view() never copies data, so it only works when the requested shape can be described by strides over the existing memory; for a transposed tensor this usually fails. e.g:

bbb.view(-1,3)

we will get the error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-63-eec5319b0ac5> in <module>()
----> 1 bbb.view(-1,3)

RuntimeError: invalid argument 2: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Call .contiguous() before .view(). at /pytorch/aten/src/TH/generic/THTensor.cpp:203

To solve this, simply call contiguous() on the discontiguous tensor to create a contiguous copy, and then apply view():

bbb.contiguous().view(-1,3)
#tensor([[1., 4., 2.],
#        [5., 3., 6.]])
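Strictly speaking, view() needs compatible strides rather than full contiguity. The sketch below, using a made-up 3-D tensor, shows a view that succeeds on a non-contiguous tensor, one that fails, and reshape() as an alternative that copies only when it has to.

```python
import torch

x = torch.arange(24).view(2, 3, 4)
y = x.transpose(0, 1)            # shape (3, 2, 4), strides (4, 12, 1)

# Splitting the last dimension can be expressed with strides, so no copy is needed.
split = y.view(3, 2, 2, 2)

# Merging the two transposed dimensions cannot, so view() raises.
try:
    y.view(6, 4)
    merged_ok = True
except RuntimeError:
    merged_ok = False

# reshape() returns a view when possible and falls back to a copy otherwise.
merged = y.reshape(6, 4)

print(y.is_contiguous(), merged_ok)
print(split.data_ptr() == x.data_ptr(), merged.data_ptr() == x.data_ptr())
```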

As noted in the previous answer, contiguous() returns a tensor stored in one canonically ordered block of memory. This is helpful when passing tensors to C or C++ backend code, where tensors are passed as pointers.


The accepted answers were great, and I tried to reproduce the effect of transpose(). I wrote two functions, samestorage() and contiguous(), that check whether two tensors share storage and whether a tensor is contiguous.

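def samestorage(x,y):
    # Compare the start of the underlying storage, not the tensor's own data_ptr().
    # Note: storage() is deprecated in recent PyTorch releases in favor of untyped_storage().
    if x.storage().data_ptr()==y.storage().data_ptr():
        print("same storage")
    else:
        print("different storage")

def contiguous(y):
    if y.is_contiguous():
        print("contiguous")
    else:
        print("non contiguous")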

I checked and got this result as a table:

[image: table of results for each function]

You can review the checker code below, but first here is one example where the tensor is non-contiguous. We cannot simply call view() on that tensor; we need to call reshape() instead, or we could call .contiguous().view().

x = torch.randn(3,2)
y = x.transpose(0, 1)
y.view(6) # RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
  
x = torch.randn(3,2)
y = x.transpose(0, 1)
y.reshape(6)

x = torch.randn(3,2)
y = x.transpose(0, 1)
y.contiguous().view(6)

Note also that some methods return contiguous tensors and others return non-contiguous ones. Some methods operate on the same storage, while others, such as flip(), create a new storage (read: clone the tensor) before returning.

The checker code:

import torch
x = torch.randn(3,2)
y = x.transpose(0, 1) # swaps two axes
print("\ntranspose")
print(x)
print(y)
contiguous(y)
samestorage(x,y)

print("\nnarrow")
x = torch.randn(3,2)
y = x.narrow(0, 1, 2) #dim, start, len  
print(x)
print(y)
contiguous(y)
samestorage(x,y)

print("\npermute")
x = torch.randn(3,2)
y = x.permute(1, 0) # sets the axis order
print(x)
print(y)
contiguous(y)
samestorage(x,y)

print("\nview")
x = torch.randn(3,2)
y=x.view(2,3)
print(x)
print(y)
contiguous(y)
samestorage(x,y)

print("\nreshape")
x = torch.randn(3,2)
y = x.reshape(6,1)
print(x)
print(y)
contiguous(y)
samestorage(x,y)

print("\nflip")
x = torch.randn(3,2)
y = x.flip(0)
print(x)
print(y)
contiguous(y)
samestorage(x,y)

print("\nexpand")
x = torch.randn(3,2)
y = x.expand(2,-1,-1)
print(x)
print(y)
contiguous(y)
samestorage(x,y)
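The same checks can be collected in a loop. This is a compact sketch rather than the original checker: storage_ptr, ops, and results are names introduced here, and the fallback from untyped_storage() to storage() is an assumption made so the snippet runs on both newer and older PyTorch releases.

```python
import torch

def storage_ptr(t):
    # untyped_storage() is the newer spelling; fall back to storage() on older releases.
    get_storage = getattr(t, "untyped_storage", None) or t.storage
    return get_storage().data_ptr()

# Same operations and arguments as in the checker code.
ops = {
    "transpose": lambda x: x.transpose(0, 1),
    "narrow": lambda x: x.narrow(0, 1, 2),
    "permute": lambda x: x.permute(1, 0),
    "view": lambda x: x.view(2, 3),
    "reshape": lambda x: x.reshape(6, 1),
    "flip": lambda x: x.flip(0),
    "expand": lambda x: x.expand(2, -1, -1),
}

results = {}
for name, op in ops.items():
    x = torch.randn(3, 2)
    y = op(x)
    results[name] = (y.is_contiguous(), storage_ptr(x) == storage_ptr(y))
    print(f"{name:10s} contiguous={results[name][0]!s:5s} same_storage={results[name][1]}")
```

With these particular arguments, narrow() keeps the last two full rows, so its result happens to be contiguous even though it is a view; narrowing along the second dimension would not be.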

A one-dimensional array [0, 1, 2, 3, 4] is contiguous if its items are laid out in memory next to each other just like below:

[image: contiguous memory allocation]

It is not contiguous if the region of memory where it is stored looks like this:

[image: non contiguous allocation]

For arrays with 2 or more dimensions, items must also be next to each other, but the order can follow different conventions. Let's consider the 2D array below:

>>> t = torch.tensor([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]])

[image: two dimensional array]

The memory allocation is C contiguous if the rows are stored next to each other like this:

[image: two dimensional memory]

This is what PyTorch considers contiguous.

>>> t.is_contiguous()
True

The stride attribute associated with the array gives the number of elements (not bytes, unlike NumPy strides) to skip to get the next element in each dimension.

>>> t.stride()
(4, 1)

We need to skip 4 elements to go to the next line, but only one element to go to the next element in the same line.
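PyTorch reports strides in units of elements. If byte offsets are needed (for example to compare with NumPy), they can be obtained by multiplying by element_size(). A short sketch, where byte_strides is just an illustrative name:

```python
import torch

t = torch.tensor([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]])

elem = t.element_size()                              # 8 bytes per int64 element
byte_strides = tuple(s * elem for s in t.stride())   # strides converted to bytes

print(t.dtype, t.stride(), byte_strides)  # torch.int64 (4, 1) (32, 8)
```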

As said in other answers, some PyTorch operations do not change the memory allocation, only metadata.

For instance the transpose method. Let's transpose the tensor:

[image: two dimensional array, transposed]

The memory allocation didn't change:

[image: two dimensional memory non contiguous]

But the stride did:

>>> t.T.stride()
(1, 4)

We need to skip 1 element to go to the next line and 4 elements to go to the next element in the same line. The tensor is not C contiguous anymore (it is in fact Fortran contiguous: the columns are stored next to each other)

>>> t.T.is_contiguous()
False

contiguous() will copy the data into a new memory allocation, arranged so that the tensor is C contiguous:

[image: two dimensional memory contiguous]

>>> t.T.contiguous().stride()
(3, 1)
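The rearrangement can be checked by reading the elements back in memory order. In this sketch, view(-1) flattens a contiguous tensor without copying, so the resulting list reflects how the elements are laid out; before and after are illustrative names.

```python
import torch

t = torch.tensor([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]])

before = t.view(-1).tolist()                 # memory order of the original tensor
after = t.T.contiguous().view(-1).tolist()   # memory order after contiguous()

print(before)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(after)   # [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]
```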