I recently followed this tutorial to train my own image classifier:
https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/?utm_campaign=chrome_series_machinelearning_063016&utm_source=gdev&utm_medium=yt-desc#0
For those who don't know it: it lets you retrain the last layer of Google's Inception model so that the prediction graph works on your own custom categories.
Once training was done, I deployed the model on iOS using this tutorial:
https://petewarden.com/2016/09/27/tensorflow-for-mobile-poets/
The model works great in the wild; I'm achieving up to 98% accuracy on natural images. It was trained on only 2 classes, essentially giving a yes/no answer to whether a calculator is present in an image: if a calculator is present it says yes, if not, it says no.
My question: is it possible to draw a bounding box around the calculator using our output graph, or even a heatmap of the detection? I need to crop the image further based on the detection.
The disappointing but accurate answer is that this kind of ImageNet-style classification training only produces label(s) for an input image, not a bounding box. You would need to train a network to identify the region of interest (ROI). There are a few interesting papers in this SO answer that might help; the key terms are "ROI" and "saliency detection".
If you're desperate to reuse that pre-trained network, you could try taking random sub-crops of the image and picking the smallest one that still has the correct label. I've never tried this, so it might be a poor proxy; a sketch of the idea is below.
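For illustration, here's a minimal sketch of that sub-crop search. The `classify(crop)` helper is hypothetical: assume it runs a crop through your retrained graph and returns `(label, confidence)`. The label string, confidence threshold, and crop counts are all assumptions you'd tune:

```python
import random

def smallest_positive_crop(image, classify, n_crops=200,
                           min_frac=0.2, min_conf=0.9):
    """Sample random sub-crops and return the box (x, y, w, h) of the
    smallest crop the classifier still labels "yes" with high confidence."""
    img_h, img_w = image.shape[:2]  # image is an HxWxC numpy array
    best_area, best_box = None, None
    for _ in range(n_crops):
        cw = random.randint(int(min_frac * img_w), img_w)
        ch = random.randint(int(min_frac * img_h), img_h)
        x = random.randint(0, img_w - cw)
        y = random.randint(0, img_h - ch)
        label, conf = classify(image[y:y + ch, x:x + cw])
        if label == "yes" and conf >= min_conf:
            area = cw * ch
            if best_area is None or area < best_area:
                best_area, best_box = area, (x, y, cw, ch)
    return best_box  # None if no positive crop was found
```

A systematic sliding window would cover the image more evenly, but either way you pay one forward pass per crop, so keep `n_crops` modest.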
Edit: It looks like this paper used an image classification net to compute a saliency map. I'd follow their ideas; a rough sketch of one such approach is below.
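I don't know that paper's exact method, but a common gradient-based saliency approach (Simonyan et al. style) works against a frozen TF 1.x graph like the one the tutorial produces. The tensor names `Mul:0` (preprocessed 299x299 input) and `final_result:0` (retrained softmax) are assumptions based on that tutorial's retrained Inception graph; verify yours with `graph.get_operations()`:

```python
import numpy as np
import tensorflow as tf  # TF 1.x, matching the tutorial era

def saliency_map(graph_path, image_array):
    """Per-pixel saliency: |d(top class score) / d(input pixel)|.
    image_array must already be preprocessed to the graph's expected
    input (299x299x3 floats for the retrained Inception model)."""
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(graph_path, "rb") as f:
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name="")
        inp = graph.get_tensor_by_name("Mul:0")             # assumed input tensor
        probs = graph.get_tensor_by_name("final_result:0")  # assumed output tensor
        # Gradient of the winning class score w.r.t. the input image;
        # reduce_max selects the top class, so only its gradient flows back.
        grad = tf.gradients(tf.reduce_max(probs[0]), inp)[0]
        with tf.Session(graph=graph) as sess:
            g = sess.run(grad, feed_dict={inp: image_array[np.newaxis, ...]})
    return np.max(np.abs(g[0]), axis=-1)  # collapse channels -> HxW heatmap
```

Thresholding that heatmap and taking the bounding box of the strongest region gives you a crop. It's coarse, but it reuses the network you already trained.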