
Read what number the colored number image is to console

I'm trying to write a program that recognizes which number an image shows and prints that integer to the console. (I'm using Python 3.)

For example, the program should recognize that the following image (an actual image the program has to check) shows the number 2:

number 2

I've tried comparing it with another image containing a 2 using cv2.matchTemplate(), but the RGB values of the blue pixels differ slightly from image to image, and the image can be a bit larger or smaller. For example, the following image:

number 2

It also has to tell it apart from all the other blue number images (0-9), for example the following one:

number 5

I've tried multiple template-matching snippets and made a folder with images of the numbers 0-9 as templates, but each time almost every template matches the image being checked. For example, number 5 gets recognized in an image that shows number 2. And when it doesn't match all of them, it matches the wrong one(s).

The ones I've tried:

  • Answer from this question
  • Both codes from this tutorial
  • And the one from this tutorial

But, as I said, each of them has those problems.

I've also tried measuring the percentage of blue in each image, but the values were too close to tell the numbers apart.

Does anyone have a solution? Am I being stupid for using cv2.matchTemplate(), and is there a much simpler option? (I don't mind using a library for it, because this is part of a bigger piece of code, but I'd prefer to code it myself rather than rely on libraries.)

asked Jan 17 '20 20:01 by kaci




2 Answers

Instead of template matching, a better approach is to use Pytesseract OCR to read the number with image_to_string(). Before performing OCR, though, you need to preprocess the image: for optimal OCR performance, the text/number/characters to read should be black on a white background. A simple preprocessing pipeline is to convert the image to grayscale, apply Otsu's threshold to obtain a binary image, then invert it. Here's a visualization of the preprocessing steps:

Input image -> Grayscale -> Otsu's threshold -> Inverted image ready for OCR


Result from Pytesseract OCR

2

Here are the results for the other two images:

2

5

We use the --psm 6 configuration option to assume a single uniform block of text. See here for more configuration options.

Code

import cv2
import pytesseract

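# Windows only: point pytesseract at the Tesseract executable (adjust the path, or drop this line if tesseract is on your PATH)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"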

# Load image, grayscale, Otsu's threshold, then invert
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
invert = 255 - thresh

# Perform OCR with Pytesseract
data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
print(data)

cv2.imshow('thresh', thresh)
cv2.imshow('invert', invert)
cv2.waitKey()
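
Since image_to_string() returns a string (often with trailing whitespace) rather than a number, a small parsing step is still needed to get the integer the question asks for. The following is a minimal sketch and not part of the original answer: ocr_text_to_int and read_number are hypothetical helper names, and the tessedit_char_whitelist setting is an assumption about your Tesseract build (some versions may ignore it).

```python
import re


def ocr_text_to_int(text):
    """Return the first run of digits in OCR output as an int, or None if there is none."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None


def read_number(image):
    """OCR a preprocessed image and return the number it shows (None if unreadable).

    Hypothetical helper: `image` is expected to be the preprocessed array
    from the code above.
    """
    # Imported here so the parser above can be used without Tesseract installed.
    import pytesseract

    # Assumption: restricting the character set to digits is honoured by
    # the installed Tesseract version.
    config = "--psm 6 -c tessedit_char_whitelist=0123456789"
    text = pytesseract.image_to_string(image, lang="eng", config=config)
    return ocr_text_to_int(text)


# The parser alone, applied to a typical raw OCR string:
print(ocr_text_to_int("2\n\x0c"))  # prints 2
```

Returning None instead of raising lets the calling code decide what to do when nothing readable was found.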

Note: If you insist on using template matching, you need to use scale-invariant (multi-scale) template matching, because cv2.matchTemplate() on its own does not handle size differences. Take a look at "how to isolate everything inside of a contour, scale it, and test the similarity to an image?" and "Python OpenCV line detection to detect X symbol in image" for some examples. If you know for certain that your images are blue, another approach is to use color thresholding with cv2.inRange() to obtain a binary mask, then apply OCR to that mask.

answered Oct 09 '22 22:10 by nathancy


Given the lovely regular input, I expect that all you need is simple comparison to templates. Since you neglected to supply your code and output, it's hard to tell what might have gone wrong.

Very simply ...

  • Rescale your input to the size of your templates.
  • Calculate any straightforward matching evaluation on the input with each of the 10 templates. A simple matching count should suffice: how many pixels match between the two images.
  • The template with the highest score is the identification.

You might also want to set a lower threshold for declaring a match, perhaps based on how well that template matches each of the other templates: any identification has to clearly exceed the match between two different templates.
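
The steps above can be sketched as follows. This is a rough illustration, not a tested solution for the actual images: it assumes the input and templates have already been binarized into 2-D arrays of the same dtype, uses a plain NumPy nearest-neighbour resize (cv2.resize would do the same job), and all function names and the margin value are hypothetical. The tiny 3x3 "glyphs" are toy stand-ins for real digit templates.

```python
import numpy as np


def resize_nearest(binary, shape):
    """Nearest-neighbour resize of a 2-D array to shape (rows, cols)."""
    rows = np.arange(shape[0]) * binary.shape[0] // shape[0]
    cols = np.arange(shape[1]) * binary.shape[1] // shape[1]
    return binary[rows[:, None], cols]


def match_score(candidate, template):
    """Fraction of pixels that agree after rescaling candidate to the template size."""
    resized = resize_nearest(candidate, template.shape)
    return float(np.mean(resized == template))


def classify(candidate, templates, margin=0.05):
    """Return the best-matching label, or None if the match is not convincing.

    The floor is the best score any two *different* templates reach against
    each other, so at least two templates are required.
    """
    floor = max(
        match_score(a, b)
        for label_a, a in templates.items()
        for label_b, b in templates.items()
        if label_a != label_b
    )
    scores = {label: match_score(candidate, t) for label, t in templates.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > floor + margin else None


# Toy templates: a vertical bar standing in for "1" and a ring standing in for "0".
templates = {
    1: np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=bool),
    0: np.array([[1, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=bool),
}
# A "larger screenshot" of the bar: every pixel doubled in both directions.
candidate = np.repeat(np.repeat(templates[1], 2, axis=0), 2, axis=1)
print(classify(candidate, templates))  # prints 1
```

Deriving the floor from the templates themselves follows the suggestion above: an identification only counts if it clearly beats how similar two different digits are to each other.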

answered Oct 09 '22 21:10 by Prune