I was wondering if there is any benefit to training on high resolution images rather than low resolution. I understand that it will take longer to train on larger images and that the dimensions must be a multiple of 32. My current image set is 1440x1920. Would I be better off resizing to 480x640, or is bigger better?
It's certainly not a requirement that your images be powers of two. There may be some cases where it speeds things up (e.g. GPU allocation) but it's not critical.
Smaller images will train significantly faster, and possibly even converge quicker (all other factors held constant) as you will be able to train on bigger batches (e.g. 100-1000 images in one pass, which you might not be able to do on a single machine with high res imagery).
As to whether to resize, you need to ask yourself if every pixel in that image is critical to your task. Often this is not the case - you can probably resize a photo of a bus down to say 128x128 and still recognize that it's a bus.
Using smaller images can also help your network generalise better, too, as there is less data to overfit.
A technique often used in image classification networks is to perform distortions (e.g. random cropping, scaling & brightness adjustment) on images to (a) convert odd-sized images to a constant size, (b) synthesize more data and (c) encourage the network to generalise.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With