 

Is there any particular reason why people pick 224x224 image size for imagenet experiments?

Is it that 224x224 gives better accuracy for some reason or just computational constraint? I would think that bigger picture should give better accuracy, no?

user10024395 asked Apr 16 '17 06:04



1 Answer

Well, bigger images contain more information, but not all of it is relevant. The size of your input matters because the bigger the input, the more parameters your network will have to handle. More parameters lead to several problems: first, you'll need more computing power; then you may need more data to train on, since a lot of parameters and not enough samples can lead to overfitting, especially with CNNs. The choice of 224x224 in AlexNet also allowed the authors to apply some data augmentation.
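To see how input size drives parameter count, here is a small sketch. The downsampling factor, channel count, and FC width below are illustrative assumptions (not taken from any specific published architecture): a CNN that downsamples by 32 and feeds a 256-channel feature map into a 4096-unit fully connected layer.

```python
# Hypothetical illustration: how the first fully connected layer's
# parameter count grows with input size. Assumes the network downsamples
# the input by a factor of 32 (e.g. five 2x2 pooling stages) and ends in
# a 256-channel feature map feeding a 4096-unit FC layer -- all numbers
# chosen for illustration only.

def fc_params(input_size, downsample=32, channels=256, fc_units=4096):
    """Weights in the first FC layer (biases ignored)."""
    fmap = input_size // downsample      # spatial side of the last feature map
    return fmap * fmap * channels * fc_units

for size in (224, 448):
    print(size, fc_params(size))
```

Note that doubling the input side quadruples the FC parameter count, which is why larger inputs demand more compute and more data.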

For instance, if you have a 512x512 image and want to recognize an object in it, it would be better to resample it to 256x256, take smaller patches of 224x224 or 200x200, apply some data augmentation, and then train. You could also use patches of 400x400 with data augmentation, provided that you have enough data.
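The crop-based augmentation described above can be sketched as follows, assuming the image has already been resampled to 256x256 and is held as a NumPy array; the random horizontal flip is a common companion step:

```python
# Minimal sketch: take random 224x224 patches (with an optional mirror)
# from a 256x256 image, as in crop-based data augmentation.
import numpy as np

def random_crop(img, crop=224, rng=None):
    """Return a random crop x crop patch from an HxWxC array,
    mirrored horizontally half the time."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    y = rng.integers(0, h - crop + 1)
    x = rng.integers(0, w - crop + 1)
    patch = img[y:y + crop, x:x + crop]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]           # horizontal mirror
    return patch

image = np.zeros((256, 256, 3), dtype=np.uint8)  # stand-in for a resampled image
patch = random_crop(image)
print(patch.shape)  # (224, 224, 3)
```

Each training pass then sees a slightly different view of the same image, which effectively multiplies the dataset without collecting new samples.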

Don't forget to do cross-validation so you can check if there's overfitting.
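A bare-bones k-fold split (indices only) is enough to check for overfitting: train on k-1 folds, validate on the held-out fold, and watch for a large gap between training and validation accuracy. This helper is a generic sketch, not tied to any particular framework:

```python
# Minimal k-fold index splitter for cross-validation.
def kfold_indices(n, k=5):
    """Yield (train_indices, val_indices) pairs over n samples in k folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val

for train, val in kfold_indices(10, k=5):
    print(len(train), len(val))  # 8 2 on each of the five folds
```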

Lucas Ramos answered Oct 29 '22 03:10