
Tesseract does not recognize an image containing a single digit

Tags:

tesseract

I am using Tesseract with Python. It recognizes almost all of my images that contain two or more digits or characters, but it cannot recognize an image that contains only a single digit. I also tried the command line, and it gives me "empty page" as the response.

I don't want to train Tesseract on "only digits" because I am recognizing characters too.

What is the problem?

Below is the image that is not recognized by Tesseract.

enter image description here

Code:

 import pytesseract
 from PIL import Image
 # getPng(pathImg, '3') builds the path to the image file.
 pytesseract.image_to_string(Image.open(getPng(pathImg, '3')))
Luiza Rodrigues asked Mar 26 '18 20:03




2 Answers

If you add the parameter --psm 13 it should work, because Tesseract will then treat the image as a single raw text line instead of searching for pages and paragraphs.

So try:

pytesseract.image_to_string(PATH, config="--psm 13") 
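If --psm 13 alone still returns an empty string, one option is to try a few page segmentation modes in turn. The sketch below is only an illustration under stated assumptions: the helper names build_config and first_nonempty_ocr are hypothetical (not part of pytesseract), and the fallback modes 10 (single character) and 7 (single text line) are my own choice rather than something suggested in this answer.

```python
def build_config(psm):
    """Return the Tesseract config string for one page segmentation mode."""
    return "--psm {}".format(psm)


def first_nonempty_ocr(image, psm_modes=(13, 10, 7), ocr=None):
    """Try each PSM in order; return (psm, text) for the first non-empty result.

    `ocr` is any callable shaped like ocr(image, config=...) -> str.
    Returns (None, "") when every mode yields an empty string.
    """
    if ocr is None:
        # Imported lazily so the helper can be exercised without Tesseract.
        import pytesseract
        ocr = pytesseract.image_to_string
    for psm in psm_modes:
        text = ocr(image, config=build_config(psm)).strip()
        if text:
            return psm, text
    return None, ""
```

Calling first_nonempty_ocr(PATH) would then report which mode, if any, produced text for the image.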
sinecode answered Sep 24 '22 20:09



Try converting the image to grayscale and then to a binary image; Tesseract will most probably read it then. If not, duplicate the image so that it contains two characters to read. You can then simply extract a single character from the result.
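The steps above could be sketched roughly as follows with Pillow. Everything here is an assumption made for illustration: the function names, the threshold of 128, the 10-pixel gap, and the white canvas (which presumes dark text on a light background) do not come from the answer itself.

```python
def binarize_and_duplicate(path, threshold=128, gap=10):
    """Grayscale and binarize the image, then paste two copies side by side."""
    # Imported lazily so the text helper below works without Pillow installed.
    from PIL import Image

    gray = Image.open(path).convert("L")
    # Pixels brighter than the threshold become white, all others black.
    binary = gray.point(lambda p: 255 if p > threshold else 0)
    width, height = binary.size
    # White canvas wide enough for two copies plus a gap between them.
    canvas = Image.new("L", (2 * width + gap, height), 255)
    canvas.paste(binary, (0, 0))
    canvas.paste(binary, (width + gap, 0))
    return canvas


def single_char_from_duplicate(text):
    """Return the first non-whitespace character of the OCR text, or ''."""
    return "".join(text.split())[:1]
```

The duplicated image can be passed to pytesseract.image_to_string, and single_char_from_duplicate then reduces the two-character result back to one character.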

Ashane.E answered Sep 24 '22 20:09
