 

Why does a PyTorch model accept inputs of multiple image sizes?

I am using a simple object detection model in PyTorch for inference.

When I iterate over the images one at a time:

for k, image_path in enumerate(image_list):
    image = imgproc.loadImage(image_path)  # assumed to return a [1, 3, H, W] tensor
    print(image.shape)
    x = image.cuda()                       # move to GPU before the forward pass
    with torch.no_grad():
        y, feature = net(x)

It prints out variable-sized image shapes, such as

torch.Size([1, 3, 384, 320])

torch.Size([1, 3, 704, 1024])

torch.Size([1, 3, 1280, 1280])

So when I run batch inference using a DataLoader with the same transformation, the code fails. However, when I resize all the images to 600×600, batch processing runs successfully, as in the sketch below.
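For reference, here is a minimal sketch of the kind of batch setup I mean. The ImageDataset class and the use of torchvision transforms here are illustrative, not my exact code; image_list is the same list of paths as above.

from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image

class ImageDataset(Dataset):
    # Hypothetical dataset wrapper for illustration.
    def __init__(self, image_paths):
        self.image_paths = image_paths
        # Resizing every image to a fixed 600x600 gives all samples the
        # same shape, so the default collate can stack them into a batch.
        self.transform = transforms.Compose([
            transforms.Resize((600, 600)),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image = Image.open(self.image_paths[idx]).convert("RGB")
        return self.transform(image)

loader = DataLoader(ImageDataset(image_list), batch_size=8)
for batch in loader:
    print(batch.shape)  # torch.Size([8, 3, 600, 600])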

I have two doubts:

First, why is PyTorch capable of taking dynamically sized inputs in a deep learning model, and second, why does dynamically sized input fail in batch processing?

asked Jul 03 '20 by Abhik Sarkar


1 Answer

PyTorch has what is called a Dynamic Computational Graph.

It allows the graph of the neural network to dynamically adapt to its input size, from one input to the next, during training or inference. This is what you observe in your first example: providing an image as a tensor of size [1, 3, 384, 320] to your model, then another one as a tensor of size [1, 3, 704, 1024], and so forth, is completely fine, as the model dynamically adapts to each input.
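Here is a minimal sketch of that behavior, using a single Conv2d layer as a stand-in for a fully convolutional model (convolutions do not care about spatial size):

import torch
import torch.nn as nn

# A single conv layer standing in for a fully convolutional detector.
net = nn.Conv2d(3, 8, kernel_size=3, padding=1)

with torch.no_grad():
    for h, w in [(384, 320), (704, 1024), (1280, 1280)]:
        y = net(torch.randn(1, 3, h, w))  # a fresh graph is built for each input
        print(y.shape)                    # spatial size follows the input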

However, if your input is actually a collection of inputs (a batch), it is another story. A batch, for PyTorch, is a single tensor with one extra dimension. For example, if you provide a list of n images, each of size [3, 384, 320], PyTorch will stack them, so that your model receives a single tensor input of shape [n, 3, 384, 320].
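You can reproduce this stacking yourself; torch.stack is essentially what the DataLoader's default collate function does:

import torch

images = [torch.randn(3, 384, 320) for _ in range(4)]  # four same-sized samples
batch = torch.stack(images)  # roughly what the default collate does
print(batch.shape)           # torch.Size([4, 3, 384, 320])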

This "stacking" can only happen between images of the same shape. To provide a more "intuitive" explanation than previous answers, this stacking operation cannot be done between images of different shapes, because the network cannot "guess" how the different images should "align" with one another in a batch, if they are not all the same size.

Whether it happens during training or testing, if you create a batch out of images of varying sizes, PyTorch will refuse your input.
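For example (the exact error message may vary between PyTorch versions):

import torch

a = torch.randn(3, 384, 320)
b = torch.randn(3, 704, 1024)
torch.stack([a, b])  # RuntimeError: stack expects each tensor to be equal size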

Several solutions are commonly used: resizing, as you did; padding (often with zeros or other constant values at the borders of your images) to extend the smaller images to the size of the largest one, as sketched below; and so forth.
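As a sketch of the padding approach, here is a hypothetical collate function (pad_collate is my name, not a PyTorch built-in) that pads every image in a batch to the largest height and width, then stacks them:

import torch
import torch.nn.functional as F

def pad_collate(images):
    # Pad each image on the right and bottom with zeros, up to the
    # largest height and width in the batch, then stack.
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)
    padded = [F.pad(img, (0, max_w - img.shape[2], 0, max_h - img.shape[1]))
              for img in images]
    return torch.stack(padded)

batch = pad_collate([torch.randn(3, 384, 320), torch.randn(3, 704, 1024)])
print(batch.shape)  # torch.Size([2, 3, 704, 1024])

You would then pass it to your loader as DataLoader(dataset, batch_size=..., collate_fn=pad_collate). Note that whether padding is acceptable depends on the model; for detection models it usually is, since padding only adds empty border regions.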

answered Sep 27 '22 by clef