Is it that 224x224 gives better accuracy for some reason, or is it just a computational constraint? I would think that a bigger picture should give better accuracy, no?
To organize such a massive amount of data, ImageNet actually follows the WordNet hierarchy. Each meaningful word/phrase inside WordNet is called a "synonym set", or "synset" for short. Within the ImageNet project, images are organized according to these synsets, with the goal being to have 1,000+ images per synset.
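Since ImageNet class IDs are literally WordNet offsets, you can explore this hierarchy programmatically. Here is a minimal sketch using NLTK's WordNet interface (assuming NLTK is installed and the WordNet corpus has been downloaded); the beagle synset is just an illustrative example.

```python
# A minimal sketch, assuming NLTK is installed and the WordNet corpus
# has been fetched via nltk.download("wordnet").
from nltk.corpus import wordnet as wn

# Each ImageNet class corresponds to one WordNet synset.
synset = wn.synset("beagle.n.01")
print(synset.definition())  # prints the synset's gloss

# ImageNet synset IDs are the part-of-speech letter plus the
# zero-padded WordNet offset, e.g. "n02088364" for beagle.
print("n{:08d}".format(synset.offset()))

# Walking up the hypernym chain shows how synsets nest
# (beagle -> hound -> dog -> ...).
print(synset.hypernyms())
```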
While in the context of image classification, object detection, and scene understanding we often refer to ImageNet as both the classification challenge and the dataset associated with it, remember that there is also a broader project called ImageNet where these images are collected, annotated, and organized.
I don't believe ImageNet has images of the human spine. You should consider gathering your own dataset.

Great tutorial! Can you please provide a step-by-step guide to training a classifier on my own dataset using ResNet in Keras with TensorFlow as the backend?
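Not the author, but here is a rough sketch of the usual transfer-learning recipe with ResNet50 in Keras (TensorFlow backend). The directory layout (dataset/train and dataset/val with one sub-folder per class), the class count, and the hyperparameters are all assumptions for illustration, not a prescription from the tutorial.

```python
# A minimal transfer-learning sketch: ResNet50 pre-trained on ImageNet,
# with a new classification head for your own classes.
from keras.applications import ResNet50
from keras.applications.resnet50 import preprocess_input
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator

NUM_CLASSES = 5  # hypothetical number of classes in your own dataset

# Load ResNet50 without its ImageNet classification head.
base = ResNet50(weights="imagenet", include_top=False,
                input_shape=(224, 224, 3))

# Freeze the convolutional base so only the new head is trained at first.
for layer in base.layers:
    layer.trainable = False

# Attach a new classifier head for your own classes.
x = GlobalAveragePooling2D()(base.output)
x = Dense(256, activation="relu")(x)
out = Dense(NUM_CLASSES, activation="softmax")(x)
model = Model(inputs=base.input, outputs=out)

model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# Stream images from disk; these paths are hypothetical.
gen = ImageDataGenerator(preprocessing_function=preprocess_input)
train = gen.flow_from_directory("dataset/train", target_size=(224, 224),
                                batch_size=32, class_mode="categorical")
val = gen.flow_from_directory("dataset/val", target_size=(224, 224),
                              batch_size=32, class_mode="categorical")

model.fit_generator(train, validation_data=val, epochs=10)
```

Once the new head has converged, a common second step is to unfreeze some of the top ResNet blocks and fine-tune them with a much smaller learning rate.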
Which is the most accurate architecture on ImageNet among AlexNet, ResNet, Inception, and VGG?

On ImageNet specifically? ResNet is typically the most accurate.

[INFO] loading inception…
[INFO] loading and pre-processing image…
[INFO] classifying image with 'inception'…
is showing an error? It is creating a .png file with empty contents.
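For context, those [INFO] lines come from a script that loads a pre-trained network and classifies a single image. Below is a minimal sketch of that flow with InceptionV3 (the image path is hypothetical). If the output .png comes out empty, the first thing to check is that the input image actually loaded, since cv2.imread() silently returns None for a bad path.

```python
# A minimal load-and-classify sketch with a pre-trained InceptionV3;
# "example.jpg" is a hypothetical path.
import numpy as np
from keras.applications import InceptionV3
from keras.applications.inception_v3 import preprocess_input, decode_predictions
from keras.preprocessing.image import img_to_array, load_img

print("[INFO] loading inception...")
model = InceptionV3(weights="imagenet")

print("[INFO] loading and pre-processing image...")
# InceptionV3 expects 299x299 inputs, unlike the 224x224 used by VGG/ResNet.
image = load_img("example.jpg", target_size=(299, 299))
image = np.expand_dims(img_to_array(image), axis=0)
image = preprocess_input(image)

print("[INFO] classifying image with 'inception'...")
preds = model.predict(image)
for (_, label, prob) in decode_predictions(preds, top=3)[0]:
    print("{}: {:.2f}%".format(label, prob * 100))
```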
Well, bigger images contain more information, which may or may not be relevant. The size of your input matters because the bigger the input, the more parameters your network will have to handle. More parameters can lead to several problems: first, you'll need more computing power; then you may need more data to train on, since many parameters and too few samples can lead to overfitting, especially with CNNs. The choice of 224 in AlexNet also allowed the authors to apply some data augmentation: they rescaled images to 256x256 and trained on random 224x224 crops plus horizontal reflections.
For instance, if you have a 512x512 image and you want to recognize an object in it, it would be better to resample it to 256x256, extract smaller patches of 224x224 or 200x200, do some data augmentation, and then train. You could also use patches of 400x400 and do data augmentation and train, provided that you have enough data.
Don't forget to do cross-validation so you can check if there's overfitting.
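As a concrete illustration of the resize-then-crop augmentation described above (roughly the scheme AlexNet used), here is a minimal NumPy sketch; the resize step itself is left to whatever image library you use.

```python
# A minimal random-crop augmentation sketch: given an image already
# resized to e.g. 256x256, take a random 224x224 patch with an
# optional horizontal flip.
import numpy as np

def random_crop(image, crop_size=224):
    """Take a random crop_size x crop_size patch from an HxWxC array."""
    h, w = image.shape[:2]
    top = np.random.randint(0, h - crop_size + 1)
    left = np.random.randint(0, w - crop_size + 1)
    patch = image[top:top + crop_size, left:left + crop_size]
    if np.random.rand() < 0.5:  # mirror half the patches
        patch = patch[:, ::-1]
    return patch

# Usage: resized = <resize image to 256x256>; x = random_crop(resized)
```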