I have an input tensor of size [1, 32, 296, 400]
and a pixel set (grid) of size [1, 56000, 400, 2].
After applying grid_sample with mode='bilinear',
I get [1, 32, 56000, 400].
Can I know what exactly happened here? I know that grid_sample is supposed to effectively transform pixels to a new location in a differentiable manner, but these dimensions don't make it clear what is happening.
Please look at the documentation of grid_sample.
Your input tensor has a shape of 1x32x296x400, that is, you have a single example in the batch with 32 channels and spatial dimensions of 296x400 pixels.
Additionally, you have a "grid" of size 1x56000x400x2, which PyTorch interprets as sampling locations for an output with spatial dimensions of 56000x400: each output position holds the (x, y) coordinates in the input from which its value is sampled. These coordinates are normalized to the range [-1, 1], where -1 and 1 refer to opposite edges of the input. Hence the "grid" information is of shape 1x56000x400x2.
The output is, as expected, a tensor of shape 1x32x56000x400 with two spatial dimensions: the batch and channel dimensions are unchanged, but the spatial dimensions follow the "grid" information provided to grid_sample.
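To check the shape rule on a small case, here is a minimal sketch. It assumes a scaled-down grid of 560x400 instead of 56000x400 (only to keep memory use low) and fills both tensors with random values; the variable names are illustrative.

```python
import torch
import torch.nn.functional as F

# Same input shape as in the question: [batch, channels, height, width]
inp = torch.rand(1, 32, 296, 400)

# Assumed stand-in for the [1, 56000, 400, 2] grid, scaled down to save memory.
# The last dimension holds (x, y) coordinates normalized to [-1, 1].
grid = torch.rand(1, 560, 400, 2) * 2 - 1

out = F.grid_sample(inp, grid, mode="bilinear", align_corners=False)

# Batch and channels come from the input, spatial size comes from the grid
print(out.shape)  # torch.Size([1, 32, 560, 400])
```

With the original 56000x400 grid the same call should give [1, 32, 56000, 400], which matches what the question reports.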
If your domain is images, we can give the dimensions more intuitive names.
Your input tensor is a batch b of images with c channels, h height and w width.
Your grid is a tensor of b sampling operations, each defining an output of h height and w width pixels (these need not match the input's height and width) and, for each output pixel, the xy location in the input that should be sampled.
input [b, c, h_in, w_in]
grid [b, h_out, w_out, xy]
out [b, c, h_out, w_out]
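To make the role of the xy entries concrete, the following sketch samples a single output pixel from a known input location. It assumes align_corners=True (not the default) because then -1 and 1 map exactly to the centers of the first and last pixels, which keeps the arithmetic simple; the sizes and the chosen row and column are arbitrary.

```python
import torch
import torch.nn.functional as F

b, c, h_in, w_in = 1, 3, 5, 7
inp = torch.arange(b * c * h_in * w_in, dtype=torch.float32).reshape(b, c, h_in, w_in)

# Input pixel we want to read (arbitrary choice for illustration)
row, col = 1, 5

# Normalize to [-1, 1]; x indexes width (columns), y indexes height (rows)
x = 2 * col / (w_in - 1) - 1
y = 2 * row / (h_in - 1) - 1

# Grid of shape [b, h_out=1, w_out=1, 2]: a single output pixel
grid = torch.tensor([[[[x, y]]]], dtype=torch.float32)

out = F.grid_sample(inp, grid, mode="bilinear", align_corners=True)
print(out.shape)  # torch.Size([1, 3, 1, 1])

# The sampled values should match the input at (row, col) for every channel
matches = torch.allclose(out[0, :, 0, 0], inp[0, :, row, col], atol=1e-4)
print(matches)
```

When a grid location falls between pixel centers, mode='bilinear' blends the four neighbouring input pixels instead of copying one of them.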