I am a newbie in TensorFlow.
I am currently working through the "Convolutional Neural Network" classification examples on the TensorFlow website. They explain how to classify an input image into one of several pre-defined classes, but I can't figure out how to locate multiple objects in the same image. For example, if the input image contains a cat and a dog, I want my graph to report that both are present ("a cat and a dog").
Great question. Detecting multiple objects in the same image is essentially an "object detection" problem. Two popular algorithms are YOLO (You Only Look Once) and SSD (Single Shot MultiBox Detector). I included links to them at the bottom.
I would watch a few videos on how YOLO works and see if you grasp the idea. Then read the SSD paper and see if you understand why its authors report it as both faster and more accurate than the original YOLO.
Both algorithms are single-pass: they look at the image only "once" and predict bounding boxes for the categories they spot. There are more precise algorithms, but they are slower. They first pick many regions of the image they want to look at, and then run a classifier on each of those regions separately. As a result, the classifier runs many times per image, which is slow.
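To make "predict bounding boxes for the categories they spot" more concrete, here is a minimal sketch of how the output of one forward pass could be turned into detections. It assumes a simplified YOLO-style layout that I made up for illustration: the network returns one prediction per grid cell, consisting of an objectness score, a box and per-class probabilities. Real implementations differ in detail (several boxes per cell, anchor boxes, non-maximum suppression), and the names `Detection` and `decode_grid` are hypothetical.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Detection(NamedTuple):
    label: str
    score: float
    box: Tuple[float, float, float, float]  # (center_x, center_y, width, height), all in 0..1


def decode_grid(grid: Sequence[Sequence[tuple]],
                class_names: Sequence[str],
                threshold: float = 0.5) -> List[Detection]:
    """Turn a grid of per-cell predictions into a list of detections.

    Assumed (simplified) cell format:
        (objectness, x, y, w, h, class_probs)
    where x, y are offsets inside the cell and w, h are relative to the image.
    """
    detections = []
    n_rows = len(grid)
    for r, row in enumerate(grid):
        n_cols = len(row)
        for c, (objectness, x, y, w, h, class_probs) in enumerate(row):
            # Pick the most likely class for this cell.
            best = max(range(len(class_probs)), key=class_probs.__getitem__)
            score = objectness * class_probs[best]
            if score >= threshold:
                # Convert the in-cell offset to image-relative coordinates.
                center_x = (c + x) / n_cols
                center_y = (r + y) / n_rows
                detections.append(
                    Detection(class_names[best], score, (center_x, center_y, w, h)))
    return detections


# Hand-written stand-in for a network output on a 2x2 grid.
empty = (0.0, 0.0, 0.0, 0.0, 0.0, [0.5, 0.5])
fake_output = [
    [(0.9, 0.5, 0.5, 0.4, 0.4, [1.0, 0.0]), empty],
    [empty, (0.8, 0.5, 0.5, 0.3, 0.3, [0.0, 1.0])],
]
for det in decode_grid(fake_output, ["cat", "dog"]):
    print(det.label, det.score, det.box)
```

The point is that every cell is decoded from the same single forward pass, so the cost does not grow with the number of objects in the image.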
Since you said you are new to TensorFlow, you can try this code that other people wrote: https://github.com/thtrieu/darkflow . The very extensive README shows you how to get started on your own dataset.
Good luck, and let us know if you have other questions, or if these algorithms do not fit your use-case.
A naive approach for what you are trying to do would be to classify parts of the image independently.
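As a rough sketch of that naive approach: slide a fixed-size window over the image, classify each crop on its own, and keep the windows where the classifier is confident. Everything here is illustrative. `toy_classifier` is a hypothetical stand-in for your trained CNN, and the image is a plain list of rows so the example has no dependencies; in practice you would use NumPy arrays or TensorFlow tensors.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Image = Sequence[Sequence[float]]  # rows of grayscale pixel values in 0..1


def sliding_windows(height: int, width: int, size: int,
                    stride: int) -> Iterator[Tuple[int, int]]:
    """Yield the (top, left) corner of every size x size window that fits."""
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            yield top, left


def crop(image: Image, top: int, left: int, size: int) -> List[Sequence[float]]:
    """Cut a size x size patch out of the image."""
    return [row[left:left + size] for row in image[top:top + size]]


def detect_naive(image: Image,
                 classify: Callable[[Image], Tuple[str, float]],
                 size: int, stride: int, threshold: float = 0.5):
    """Classify every window independently and keep confident, non-background hits."""
    height = len(image)
    width = len(image[0]) if height else 0
    detections = []
    for top, left in sliding_windows(height, width, size, stride):
        label, score = classify(crop(image, top, left, size))
        if label != "background" and score >= threshold:
            detections.append((label, score, (top, left, size)))
    return detections


def toy_classifier(patch: Image) -> Tuple[str, float]:
    """Hypothetical stand-in for a trained CNN: decides by mean brightness."""
    mean = sum(sum(row) for row in patch) / (len(patch) * len(patch[0]))
    if mean > 0.75:
        return "dog", mean
    if mean > 0.25:
        return "cat", mean
    return "background", 1.0 - mean


# 4x8 toy image: a mid-gray "cat" on the left, a bright "dog" on the right.
toy_image = [[0.5] * 4 + [1.0] * 4 for _ in range(4)]
found = detect_naive(toy_image, toy_classifier, size=4, stride=4)
print(sorted({label for label, _, _ in found}))
```

The drawback is visible in the loop: the classifier runs once per window, so a realistic image with small strides and several window sizes needs a very large number of forward passes.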
But there are some better techniques for object detection. Actually, there is the TensorFlow Object Detection API, which gives you access to the most common object detection methods like Faster R-CNN or SSD.