Use Azure Machine Learning to detect a symbol within an image

Four years ago I posted this question and got a few answers that were unfortunately beyond my skill level. I just attended a Build tour conference where they spoke about machine learning, and it got me thinking about using ML as a solution to my problem. I found this on the Azure site, but I don't think it will help me because its scope is pretty narrow.

Here is what I am trying to achieve:

I have a source image:

source image

and I want to know which of the following symbols (if any) are contained in the image above:

symbols

The comparison needs to tolerate minor distortion, scaling, color differences, rotation, and brightness differences.

The number of symbols to match will ultimately exceed 100.

Is ML a good tool to solve this problem? If so, any starting tips?

asked Jun 16 '15 05:06

josh

1 Answer

As far as I know, Project Oxford (MS Azure CV API) wouldn't be suitable for your task. Its APIs are very focused on face-related tasks (detection, verification, etc.), OCR, and image description. And apparently you can't extend their models or train new ones from the existing ones.

However, although I don't know of an out-of-the-box solution for your object detection problem, there are simple enough approaches that you could try and that would give you some starting-point results.

For instance, here is a naive method you could use:

1) Create your dataset: This is probably the most tedious step and, paradoxically, a crucial one. I will assume you have a good number of images to work with. What you need to do is pick a fixed window size and extract positive and negative examples.

If some of the images in your dataset have different sizes, you will need to rescale them to a common size. You don't need to get too crazy about the size; 30x30 images would probably be more than enough. To make things easier, I would convert the images to grayscale too.
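A minimal sketch of that preprocessing, assuming NumPy and images already loaded as (H, W, 3) uint8 RGB arrays. The function names are made up for illustration, and nearest-neighbour sampling is just the simplest resize to write by hand; an image library would normally do this for you.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an (H, W, 3) uint8 RGB array to an (H, W) uint8 grayscale array."""
    weights = np.array([0.299, 0.587, 0.114])  # common luma weights for R, G, B
    return np.rint(rgb.astype(np.float64) @ weights).astype(np.uint8)

def resize_nearest(gray, size=30):
    """Resize an (H, W) array to (size, size) with nearest-neighbour sampling."""
    h, w = gray.shape
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return gray[np.ix_(rows, cols)]

def make_example(rgb, size=30):
    """Turn one cropped RGB window into a fixed-size grayscale patch."""
    return resize_nearest(to_grayscale(rgb), size)
```

Each positive or negative crop would go through `make_example` once, so every example ends up as the same 30x30 grayscale patch.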

2) Pick a classification algorithm and train it: There are an awful lot of classification algorithms out there, but if you are new to machine learning I would pick the one you understand best. With that in mind, I would check out logistic regression, which gives decent results, is easy enough for beginners, and has a lot of libraries and tutorials. For instance, this one or this one. At first I would focus on a binary classification problem (such as whether or not there is a UD logo in the picture), and once you have mastered that you can jump to the multi-class case. There are resources for that too, or you can always have several models, one per logo, and run this recipe for each one separately.
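To make the binary case concrete, here is a bare-bones logistic regression trained with batch gradient descent. It is only a sketch assuming NumPy; the function names, learning rate, and epoch count are arbitrary choices for illustration, and in practice you would more likely use a library implementation.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss.

    X: (n_samples, n_features) float array, y: (n_samples,) array of 0/1 labels.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        error = sigmoid(X @ w + b) - y         # predicted probability minus label
        w -= lr * (X.T @ error) / n_samples    # gradient step for the weights
        b -= lr * error.mean()                 # gradient step for the bias
    return w, b

def predict(X, w, b, threshold=0.5):
    """Return 1 where the predicted probability reaches the threshold, else 0."""
    return (sigmoid(X @ w + b) >= threshold).astype(int)
```

For the one-model-per-logo variant, you would train one `(w, b)` pair per symbol, each with that symbol's patches as positives and everything else as negatives.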

To train your model, you just need to read the images generated in step 1, turn each one into a vector, and label it accordingly. That is the dataset that will feed your model. If you are using grayscale images, each position in the vector corresponds to a pixel value in the range 0-255. Depending on the algorithm, you might need to rescale those values to the range [0, 1] (some algorithms perform better with values in that range). Rescaling is easy in this case: new_value = value / 255.
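As a rough illustration, assuming NumPy and that the positive and negative patches are equally sized grayscale uint8 arrays (the function name is hypothetical):

```python
import numpy as np

def images_to_dataset(positives, negatives):
    """Flatten equally sized grayscale patches into rows and scale them to [0, 1].

    Returns X with one row per patch and y with label 1 for positives, 0 for negatives.
    """
    patches = list(positives) + list(negatives)
    X = np.stack([p.reshape(-1) for p in patches]).astype(np.float64) / 255.0
    y = np.concatenate([np.ones(len(positives)), np.zeros(len(negatives))])
    return X, y
```

With 30x30 patches, every row of `X` has 900 values.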

You also need to split your dataset, reserving some examples for training, a subset for validation and another one for testing. Again, there are different ways to do this, but I'm keeping this answer as naive as possible.
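One simple way to do the split is a shuffled three-way partition. This sketch assumes NumPy; the 70/15/15 proportions and the function name are my own placeholders, not a recommendation.

```python
import numpy as np

def split_dataset(X, y, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle the examples and split them into train, validation, and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))            # random order of example indices
    n_test = int(round(len(X) * test_fraction))
    n_val = int(round(len(X) * val_fraction))
    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_val]
    train_idx = order[n_test + n_val:]         # everything left is training data
    return ((X[train_idx], y[train_idx]),
            (X[val_idx], y[val_idx]),
            (X[test_idx], y[test_idx]))
```

The test set should stay untouched until the very end; use the validation set for any tweaking along the way.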

3) Perform the detection: Now for the fun part. Given any image, you want to run your model and produce the coordinates in the picture where there is a logo. There are different ways to do this; I will describe one that is probably neither the best nor the most efficient, but that is easier to develop in my opinion.

You are going to scan the picture, extracting the pixels in a "window", rescaling those pixels to the size you selected in step 1, and then feeding them to your model.

Extracting windows to feed the model

If the model gives you a positive answer, you mark that window in the original image. Since the logo might appear at different scales, you need to repeat this process with different window sizes. You will also need to tweak the amount of space between windows.
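The scan could look roughly like this. It is a sketch assuming NumPy and a grayscale input image; `classify` stands for whatever trained model you have wrapped in a function, and the window sizes and stride fraction are placeholders you would tune.

```python
import numpy as np

def sliding_windows(image_shape, window_sizes=(30, 60, 90), stride_fraction=0.25):
    """Yield (top, left, size) for square windows at several scales."""
    height, width = image_shape
    for size in window_sizes:
        stride = max(1, int(size * stride_fraction))  # space between windows
        for top in range(0, height - size + 1, stride):
            for left in range(0, width - size + 1, stride):
                yield top, left, size

def resize_nearest(gray, size):
    """Resize an (H, W) array to (size, size) with nearest-neighbour sampling."""
    h, w = gray.shape
    return gray[np.ix_(np.arange(size) * h // size, np.arange(size) * w // size)]

def detect(gray, classify, patch_size=30, **window_options):
    """Return every window for which classify(vector) is truthy.

    classify is a hypothetical callable taking a flat vector scaled to [0, 1].
    """
    hits = []
    for top, left, size in sliding_windows(gray.shape, **window_options):
        window = gray[top:top + size, left:left + size]
        patch = resize_nearest(window, patch_size)    # same size as in step 1
        vector = patch.reshape(-1).astype(np.float64) / 255.0
        if classify(vector):
            hits.append((top, left, size))
    return hits
```

A smaller stride finds logos more reliably but costs more model evaluations, so it is worth starting coarse and refining.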

4) Rinse and repeat: On the first iteration it's very likely that you will get a lot of false positives. Take those as negative examples and retrain your model. This is an iterative process, and hopefully each iteration will leave you with fewer false positives and fewer false negatives.
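This retraining loop is often called hard negative mining. A sketch of it, assuming NumPy and a set of windows taken from pictures you know contain no logo; `train` and `predict` are hypothetical callables standing in for whatever model you chose.

```python
import numpy as np

def mine_hard_negatives(train, predict, X, y, background_vectors, rounds=3):
    """Retrain repeatedly, adding false positives from logo-free windows as negatives.

    train(X, y) -> model and predict(model, X) -> 0/1 array are hypothetical callables.
    background_vectors holds one row per window from images known to contain no logo.
    """
    model = train(X, y)
    for _ in range(rounds):
        flagged = predict(model, background_vectors) == 1  # false positives
        if not flagged.any():
            break                                          # nothing left to fix
        X = np.vstack([X, background_vectors[flagged]])
        y = np.concatenate([y, np.zeros(int(flagged.sum()))])
        model = train(X, y)
    return model, X, y
```

The same background windows can be flagged again in later rounds if the model still gets them wrong, which simply gives them more weight.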

Once you are reasonably happy with your solution, you might want to improve it. You could try other classification algorithms, such as SVMs or deep neural networks, or better object detection frameworks like Viola-Jones. You will also probably need cross-validation to compare all your solutions (you can actually use cross-validation from the beginning). By that point I bet you will be confident enough to use OpenCV or another ready-to-use framework, and you will have a fair understanding of what is going on under the hood.
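For the comparison, a plain k-fold cross-validation can be sketched as below, assuming NumPy. As before, `train` and `predict` are hypothetical callables for the model under test, and accuracy is just one possible score.

```python
import numpy as np

def cross_val_accuracy(train, predict, X, y, k=5, seed=0):
    """Estimate accuracy by k-fold cross-validation.

    train(X, y) -> model and predict(model, X) -> label array are hypothetical callables.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)  # k roughly equal index groups
    scores = []
    for i, held_out in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = train(X[train_idx], y[train_idx])
        predictions = predict(model, X[held_out])
        scores.append(float(np.mean(predictions == y[held_out])))
    return float(np.mean(scores))
```

Running this with the same folds (same `seed`) for each candidate algorithm gives numbers you can compare directly.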

Alternatively, you could disregard this whole answer and go for an OpenCV object detection tutorial like this one, or take an answer from another question like this one. Good luck!

answered Oct 24 '22 01:10

Pedrom


Tags: image-processing, opencv, machine-learning, azure, azure-machine-learning-studio
