So I'm trying to create a program that can recognize which number is in an image and print that integer in the console. (I'm using Python 3.)
For example, the program should recognize that the following image (an actual image the program has to check) is the number 2:
I've tried to just compare it with another image with a 2 in it using cv2.matchTemplate(),
but each time the blue pixels' RGB values are a little bit different for each image, and the image could be a bit larger or smaller. For example the following image:
It also has to tell it apart from all the other blue number images (0-9), for example the following one:
I've tried multiple match-template codes and made a folder with number 0-9 images as templates, but each time almost every single template gets detected in the number that needs to be recognized. For example, number 5 gets detected in an image that is number 2. And if it doesn't detect all of them, it detects the wrong one(s).
The ones I've tried all come with the problems I described above.
I've also tried to measure what percentage of each image is blue, but those percentages were too close together to tell the numbers apart that way.
Does anyone have a solution? Am I being stupid for using cv2.matchTemplate(),
and is there a much simpler option? (I don't mind using a library for it, because this is part of a bigger piece of code, but I'd prefer to write it myself instead of relying on libraries.)
Instead of using template matching, a better approach is to use Pytesseract OCR to read the number with image_to_string(). But before performing OCR, you need to preprocess the image. For optimal OCR performance, the preprocessed image should have the desired text/numbers/characters in black with the background in white. A simple preprocessing step is to convert the image to grayscale, apply Otsu's threshold to obtain a binary image, then invert the image. Here's a visualization of the preprocessing steps:
Input image -> Grayscale -> Otsu's threshold -> Inverted image ready for OCR
Result from Pytesseract OCR
2
Here are the results with the other images:
2
5
We use the --psm 6 configuration option to assume a single uniform block of text. See here for more configuration options.
Code
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Load image, grayscale, Otsu's threshold, then invert
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
invert = 255 - thresh
# Perform OCR with Pytesseract
data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.imshow('invert', invert)
cv2.waitKey()
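As an optional tweak (not part of the code above), if you know the result will always be a digit 0-9, you can also restrict Tesseract to digits by adding a character whitelist to the same config string; support for this option can depend on your Tesseract version, so treat it as something to try:
# Optional: only allow digits 0-9 in the OCR output
digits_only = pytesseract.image_to_string(invert, lang='eng', config='--psm 6 -c tessedit_char_whitelist=0123456789')
print(digits_only)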
Note: If you insist on using template matching, you need to use scale-variant template matching. Take a look at how to isolate everything inside of a contour, scale it, and test the similarity to an image? and Python OpenCV line detection to detect X symbol in image for some examples. If you know for certain that your images are blue, then another approach would be to use color thresholding with cv2.inRange() to obtain a binary mask image, then apply OCR on that image.
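A rough sketch of that last idea; the HSV bounds below are an assumption and you would tune them to the actual shade of blue in your images:
import cv2
import numpy as np

image = cv2.imread('1.png')
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

# Assumed HSV range for "blue" - adjust lower/upper to your images
lower = np.array([90, 50, 50])
upper = np.array([130, 255, 255])
mask = cv2.inRange(hsv, lower, upper)

# The mask has the digit in white on black; invert so the digit is black on white for OCR
ocr_ready = 255 - mask
cv2.imshow('mask', mask)
cv2.imshow('ocr_ready', ocr_ready)
cv2.waitKey()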
Given the lovely regular input, I expect that all you need is a simple comparison against templates. Since you didn't supply your code and output, it's hard to tell what might have gone wrong.
Very simply ...
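A minimal sketch of that idea, assuming the templates are stored as templates/0.png through templates/9.png (hypothetical filenames) and are roughly the same size as the input image:
import cv2

img = cv2.imread('unknown.png', cv2.IMREAD_GRAYSCALE)

best_digit, best_score = None, -1.0
for digit in range(10):
    template = cv2.imread('templates/%d.png' % digit, cv2.IMREAD_GRAYSCALE)
    template = cv2.resize(template, (img.shape[1], img.shape[0]))  # force identical size
    # Normalized correlation on same-size images gives a single score; 1.0 is a perfect match
    score = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)[0][0]
    if score > best_score:
        best_digit, best_score = digit, score

print(best_digit, best_score)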
You might also want to set a lower threshold for declaring a match, perhaps based on how well that template matches each of the other templates: any identification has to clearly exceed the match between two different templates.