Image size of 256x256 (not 299x299) fed into Inception v3 model (PyTorch) and works?

I am testing the pretrained Inception v3 model in PyTorch. I fed it an image of size 256x256, and also one resized up to 299x299. In both cases, the image was classified correctly.

Can someone explain why the PyTorch pretrained model can accept an image that's not 299x299?

asked Dec 23 '22 by JobHunter69

1 Answer

It's because the PyTorch implementation of Inception v3 uses an adaptive average pooling layer right before the fully-connected layer.

If you take a look at the Inception3 class in torchvision/models/inception.py, the operation of most interest with respect to your question is x = F.adaptive_avg_pool2d(x, (1, 1)). Since the average pooling is adaptive, the height and width of x before pooling have no effect on the output shape. In other words, after this operation we always get a tensor of size [b, c, 1, 1], where b and c are the batch size and number of channels respectively. This way the input to the fully-connected layer is always the same size, so no exceptions are raised.
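Here's a minimal sketch of that behaviour in isolation (the channel count and spatial sizes are illustrative, roughly what the Inception v3 backbone produces for 299x299 vs. 256x256 inputs):

```python
import torch
import torch.nn.functional as F

# Two feature maps with different spatial sizes, as the backbone would
# produce for differently sized input images.
x_large = torch.randn(1, 2048, 8, 8)
x_small = torch.randn(1, 2048, 6, 6)

# Adaptive average pooling always produces the requested (1, 1) spatial
# output, regardless of the input's height and width.
print(F.adaptive_avg_pool2d(x_large, (1, 1)).shape)  # torch.Size([1, 2048, 1, 1])
print(F.adaptive_avg_pool2d(x_small, (1, 1)).shape)  # torch.Size([1, 2048, 1, 1])
```

Because both results flatten to the same number of features, the fully-connected layer downstream never sees a shape mismatch.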

That said, if you're using the pretrained Inception v3 weights, then the model was originally trained on inputs of size 299x299. Using inputs of a different size may have a negative impact on loss/accuracy, although smaller input images will almost certainly decrease computation time and memory footprint since the feature maps will be smaller.
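You can verify this end to end with the torchvision model itself. A quick sketch (the weights-enum API assumes a recent torchvision; older versions use pretrained=True instead, and the random tensors stand in for properly preprocessed images):

```python
import torch
from torchvision.models import inception_v3, Inception_V3_Weights

# Pretrained Inception v3; eval mode disables the auxiliary classifier output.
model = inception_v3(weights=Inception_V3_Weights.DEFAULT).eval()

with torch.no_grad():
    for size in (299, 256):
        x = torch.randn(1, 3, size, size)  # stand-in for a preprocessed image
        logits = model(x)
        print(size, logits.shape)          # torch.Size([1, 1000]) in both cases
```

Both input sizes run without error and produce logits over the same 1000 classes; only the intermediate feature-map sizes differ.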

answered Jan 01 '23 by jodag