New to machine learning, so I'm looking for some direction on how to get started. The end goal is to train a model to count the number of objects in an image using TensorFlow. My initial focus will be to train the model to count one specific type of object. So let's say I take coins: I will only train the model to count coins. I'm not worried about creating a generic counter for all different types of objects. I've only done Google's example of image classification of flowers, and I understand the basics of that. So I'm looking for clues on how to get started. Is this an image classification problem, and can I use the same logic as for the flowers, etc.?
To count the objects in an image, one has to make use of computer vision libraries. There are tons of libraries available for this. In this tutorial, we will use the cvlib library, a simple, high-level library for Python.
You can use the list class's count method to count the occurrences of an object in a Python list. Use this only if you want the count of a single object. It returns the number of times the object you pass appears in the list it is called on.
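As a small illustration of that count method, here is a sketch that assumes you already have a list of class names, one per detected object. The `labels` list is made up for the example and is not the output of any particular library call.

```python
# Hypothetical labels: one class name per detection in an image.
labels = ["coin", "key", "coin", "coin", "pen"]

# list.count returns how many elements are equal to the argument.
coin_count = labels.count("coin")
print(coin_count)  # 3
```

If the label never occurs, count simply returns 0 rather than raising an error.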
Probably the best-performing solution for the coin problem would be to treat it as a regression problem. Annotate 5k images with the number of objects in the scene and train your model on them. Then your model just outputs the correct number. (Hopefully)
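A minimal sketch of what the regression setup could look like in Keras. Everything specific here is an assumption for illustration: the function name `build_count_regressor`, the 128x128 RGB input size, the layer sizes, and the expectation that pixel values are already scaled to [0, 1]. The only part that matters is the shape of the problem: one linear output unit trained with a regression loss instead of a softmax over classes.

```python
import tensorflow as tf


def build_count_regressor(input_shape=(128, 128, 3)):
    """Small CNN that maps an image to a single number (the estimated count)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(64, activation="relu"),
        # One linear unit: the output is a real number, not a class probability.
        tf.keras.layers.Dense(1),
    ])
    # Mean squared error as the regression loss; mean absolute error is
    # reported because it reads directly as "off by this many coins".
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model


model = build_count_regressor()

# Training would look roughly like this, where `images` and `counts` are
# hypothetical arrays of shape (N, 128, 128, 3) and (N,):
# model.fit(images, counts, validation_split=0.2, epochs=20)
```

At prediction time you would round the output to the nearest integer. Compared with the flowers example, the data pipeline stays about the same; only the labels (a number per image instead of a class) and the final layer and loss change.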
Another way is to train a classifier that decides whether an image shows a coin and use a sliding window approach like this one: https://arxiv.org/pdf/1312.6229.pdf to classify, for each window, whether it shows a coin. Then you count the regions found. This one is easier to annotate and learn, and easier to extend. But you have the problem of choosing good windows and of combining the results from those windows cleanly.
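To make the sliding-window idea concrete, here is a rough pure-Python sketch. It is not the method from the linked paper. The names `sliding_windows`, `overlaps`, `count_objects` and the stand-in classifier `toy_is_coin` are all hypothetical; in practice `is_coin` would be your trained classifier. Overlapping positive windows are merged into one group so the same coin is not counted several times, which is one simple way to combine the window results.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive
Image = Sequence[Sequence[float]]


def sliding_windows(height: int, width: int, size: int, stride: int) -> List[Box]:
    """All square windows of the given size that fit inside the image."""
    return [(t, l, t + size, l + size)
            for t in range(0, height - size + 1, stride)
            for l in range(0, width - size + 1, stride)]


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def count_objects(image: Image, is_coin: Callable[[List[Sequence[float]]], bool],
                  size: int, stride: int) -> int:
    """Classify every window, then count groups of overlapping positive windows."""
    height, width = len(image), len(image[0])
    groups: List[List[Box]] = []
    for box in sliding_windows(height, width, size, stride):
        top, left, bottom, right = box
        patch = [row[left:right] for row in image[top:bottom]]
        if not is_coin(patch):
            continue
        # Merge this window with every existing group it overlaps.
        touching, rest = [], []
        for group in groups:
            hit = any(overlaps(box, other) for other in group)
            (touching if hit else rest).append(group)
        merged = [box] + [b for group in touching for b in group]
        groups = rest + [merged]
    return len(groups)


# Toy 8x8 "image" with two 2x2 blobs of ones standing in for coins.
image = [[0] * 8 for _ in range(8)]
for r, c in [(1, 1), (1, 2), (2, 1), (2, 2), (5, 5), (5, 6), (6, 5), (6, 6)]:
    image[r][c] = 1


def toy_is_coin(patch) -> bool:
    """Stand-in classifier: positive only if the window holds a whole blob."""
    return sum(sum(row) for row in patch) >= 4


print(count_objects(image, toy_is_coin, size=4, stride=2))  # 2
```

The result depends heavily on the window size and stride. With `stride=1` on this same toy image, positive windows around the two blobs start to overlap each other and the merge step reports a single object, which is exactly the "choosing good windows" problem.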