I am trying to implement YOLOv2 on my custom dataset. Is there any minimum number of images required for each class?
Label at least 50 images of houses to train the model. Label images of the same resolution quality and from the same angles as those that you plan to process with the trained model. Limit the number of objects that you want to detect to improve model accuracy for detecting those objects.
For each label you must have at least 10 images, each with at least one annotation (bounding box and the label). However, for model training purposes it's recommended you use about 1000 annotations per label. In general, the more images per label you have the better your model will perform.
Usually around 100 images are sufficient to train a class. If the images in a class are very similar, fewer images might be sufficient. Make sure the training images are representative of the variation typically found within the class.
It can detect the 20 Pascal VOC object classes, including person, bird, cat, cow, dog, horse, sheep, aeroplane, bicycle, boat, bus, car, motorbike and train.
There is no minimum number of images per class for training. Of course, the fewer images you have, the more slowly the model will converge and the lower its accuracy will be.
What matters, according to the "How to improve object detection" advice from Alexey (maintainer of the popular darknet fork and creator of YOLOv4), is:
For each object which you want to detect, there must be at least 1 similar object in the training dataset with about the same: shape, side of object, relative size, angle of rotation, tilt, illumination. So it is desirable that your training dataset include images with objects at different scales, rotations and lightings, from different sides, and on different backgrounds. You should preferably have 2000 different images for each class or more, and you should train for 2000*classes iterations or more.
https://github.com/AlexeyAB/darknet
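As a rough illustration of the "2000*classes iterations" rule of thumb quoted above, here is a minimal Python sketch. The function name and the `per_class` parameter are made up for this example; as far as I know, in darknet the resulting number is what you would put in the `max_batches` setting of the .cfg file, but check the repository README for any extra lower bounds it recommends.

```python
def recommended_iterations(num_classes, per_class=2000):
    """Suggested minimum number of training iterations.

    Implements only the rule of thumb quoted above (2000 * classes).
    The name and the per_class parameter are hypothetical.
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return per_class * num_classes


# Example: a detector with 3 custom classes
print(recommended_iterations(3))  # 6000
```

So a 3-class dataset would train for at least 6000 iterations under this rule.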
So I think you should have a minimum of 2000 images per class if you want to get the optimum accuracy. But 1000 per class is not bad either. Even with hundreds of images per class you can still get a decent (not optimum) result. Just collect as many images as you can.
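To see how far your dataset is from those numbers, you could count how many images contain each class. The sketch below assumes darknet/YOLO-style label files (one line per box: `class_id x_center y_center width height`) whose text has already been read into a dict; the function names and the `target` parameter are hypothetical, and the threshold is whichever guideline above you choose to follow.

```python
from collections import Counter


def images_per_class(labels):
    """Count, per class id, the images that contain at least one box of that class.

    `labels` maps an image name to the text of its YOLO-format label file
    (assumed layout: "class_id x_center y_center width height" per line).
    """
    counts = Counter()
    for text in labels.values():
        # A set, so several boxes of one class in one image count once.
        class_ids = {int(line.split()[0]) for line in text.splitlines() if line.strip()}
        counts.update(class_ids)
    return counts


def underrepresented(counts, num_classes, target=2000):
    """Return {class_id: image_count} for classes with fewer than `target` images."""
    return {c: counts.get(c, 0) for c in range(num_classes) if counts.get(c, 0) < target}


# Tiny made-up example: three images, three classes, target of 2 images per class
labels = {
    "a.jpg": "0 0.5 0.5 0.2 0.2\n1 0.3 0.3 0.1 0.1\n",
    "b.jpg": "0 0.4 0.4 0.2 0.3\n0 0.7 0.7 0.1 0.1\n",
    "c.jpg": "",  # image with no objects
}
counts = images_per_class(labels)
print(underrepresented(counts, num_classes=3, target=2))  # {1: 1, 2: 0}
```

Classes that show up in the result are the ones to collect more images for first.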
It depends.
There is an objective minimum of one image per class. That may work with some accuracy, in principle, if using data-augmentation strategies and fine-tuning a pretrained YOLO network.
The objective reality, however, is that you may need as many as 1000 images per class, depending on your problem.
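To make the data-augmentation point concrete, here is a minimal sketch of one common augmentation, a horizontal flip, applied to an image and its boxes. It assumes YOLO-style normalized boxes `(class_id, x_center, y_center, width, height)` and represents the image as a plain list of pixel rows to stay dependency-free; the function names are made up, and a real pipeline would normally use an image library and more transforms (scale, rotation, lighting).

```python
def hflip_image(rows):
    """Mirror an image, given as a list of pixel rows, left to right."""
    return [row[::-1] for row in rows]


def hflip_boxes(boxes):
    """Mirror normalized YOLO boxes left to right.

    Only x_center changes (x -> 1 - x); y_center, width and height
    are unaffected by a horizontal flip.
    """
    return [(c, 1.0 - x, y, w, h) for (c, x, y, w, h) in boxes]


# Made-up 2x3 "image" with one box centred in the left quarter
image = [[1, 2, 3],
         [4, 5, 6]]
boxes = [(0, 0.25, 0.5, 0.2, 0.4)]

flipped_image = hflip_image(image)
flipped_boxes = hflip_boxes(boxes)
print(flipped_image)  # [[3, 2, 1], [6, 5, 4]]
print(flipped_boxes)  # [(0, 0.75, 0.5, 0.2, 0.4)]
```

Each flipped copy gives the network a second view of the same object, which helps when you only have a few images per class, but it does not replace genuinely different images.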