I am testing out the pretrained Inception v3 model on PyTorch. I fed it an image of size 256x256, and I also resized the same image up to 299x299. In both cases, the image was classified correctly.
Can someone explain why the PyTorch pretrained model can accept an image that's not 299x299?
It's because the PyTorch implementation of Inception v3 uses an adaptive average pooling layer right before the fully connected layer.
If you take a look at the Inception3 class in torchvision/models/inception.py, the operation of most interest with respect to your question is x = F.adaptive_avg_pool2d(x, (1, 1)). Since the average pooling is adaptive, the height and width of x before pooling are independent of the output shape. In other words, after this operation we always get a tensor of size [b, c, 1, 1], where b and c are the batch size and number of channels respectively. This way, the input to the fully connected layer is always the same size, so no exception is raised.
That said, if you're using the pretrained Inception v3 weights, then the model was originally trained for inputs of size 299x299. Using inputs of a different size may have a negative impact on loss/accuracy, although smaller input images will almost certainly decrease computational time and memory footprint since the feature maps will be smaller.
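You can check this end to end with torchvision; a quick sketch (the way pretrained weights are requested differs slightly across torchvision versions, so adjust as needed):

```python
import torch
from torchvision import models

# Older torchvision versions use pretrained=True; newer ones prefer
# weights="DEFAULT" or weights=models.Inception_V3_Weights.DEFAULT.
model = models.inception_v3(pretrained=True)
model.eval()  # eval mode skips the auxiliary classifier and returns plain logits

with torch.no_grad():
    out_299 = model(torch.randn(1, 3, 299, 299))  # the size the weights were trained on
    out_256 = model(torch.randn(1, 3, 256, 256))  # a different size still runs fine

print(out_299.shape, out_256.shape)  # both torch.Size([1, 1000])
```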