
Mobilenet SSD Input Image Size

I would like to train a Mobilenet SSD Model on a custom dataset.

I have looked into the workflow of retraining a model and noticed the image_resizer{} block in the config file:

https://github.com/tensorflow/models/blob/d6d0868209833e014074d6cb4f32558e7acf2a6d/research/object_detection/samples/configs/ssd_mobilenet_v1_pets.config#L43

Does the aspect ratio here have to be 1:1 like 300x300 or can I specify a custom ratio?

All my dataset images are 960x256 - so could I just input this size for height and width? Or do I need to resize all the images to have an aspect ratio of 1:1?

Tesla. asked Feb 22 '18 14:02





1 Answer

In the config file you linked, set the height and width to the input size at which you want the model to train and run. The model resizes any input image to that size if it differs.

So this could be the size of your input images (if your hardware can train and operate a model at that size):

image_resizer {
    fixed_shape_resizer {
        height: 256
        width: 960
    }
}

The choice will depend on the size of the training images and the resources required to train (and use) that size of model.
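As a rough way to compare candidate input sizes, the sketch below counts pixels relative to the 300x300 size mentioned in the question. It assumes that compute and activation memory grow roughly in proportion to the number of input pixels, which is an approximation and not a measured benchmark. `relative_cost` is a hypothetical helper name, not part of any library.

```python
def relative_cost(height, width, base_height=300, base_width=300):
    """Return the pixel count of (height, width) relative to a baseline size.

    Assumption: training and inference cost scale roughly with the number
    of input pixels. Treat the result as a rough guide only.
    """
    return (height * width) / (base_height * base_width)


if __name__ == "__main__":
    # Sizes taken from the question (960x256) and the answer (512x288),
    # given here as (height, width).
    candidates = {"960x256": (256, 960), "512x288": (288, 512)}
    for name, (height, width) in candidates.items():
        ratio = relative_cost(height, width)
        print(f"{name}: {ratio:.2f}x the pixels of 300x300")
```

A ratio well above 1 suggests the model will need noticeably more memory and time per image than the 300x300 baseline, so it is worth checking against your hardware before committing to a size.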

I typically use 512x288 as this size model runs happily on a Raspberry Pi. I prepare training images, at a variety of scales, at exactly this size. So the image resizer does no work during training.

For inference, I input images at 1920x1080, so the image resizer scales them to 512x288 before they pass into the Mobilenet. Both sizes are 16:9, so the aspect ratio is maintained.
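The fixed_shape_resizer resizes every image to exactly the configured height and width, so proportions survive only when the source and target sizes reduce to the same ratio. A minimal sketch for checking this, using the sizes from this thread; `aspect_ratio` and `same_aspect_ratio` are hypothetical helper names:

```python
from fractions import Fraction


def aspect_ratio(width, height):
    """Return width:height as an exact, reduced fraction."""
    return Fraction(width, height)


def same_aspect_ratio(size_a, size_b):
    """Return True if two (width, height) sizes have identical proportions."""
    return aspect_ratio(*size_a) == aspect_ratio(*size_b)


if __name__ == "__main__":
    # 1920x1080 and 512x288 both reduce to 16:9, so no distortion occurs.
    print(same_aspect_ratio((1920, 1080), (512, 288)))
    # 960x256 reduces to 15:4, so resizing it to 300x300 would stretch it.
    print(same_aspect_ratio((960, 256), (300, 300)))
```

If the check fails, either set the resizer to the dataset's own dimensions or accept that the images will be stretched.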

However, the aspect ratio is not important in my domain since such distortions occur naturally.

So yes, just use your training image dimensions.

Alaric Dobson answered Oct 21 '22 22:10
