How to reproduce:

1. Create a new image in Paint (any size).
2. Add the letter A to the image.
3. Run recognition: Tesseract does not find any letters.
4. Copy and paste the letter 5-6 times into the same image.
5. Run recognition again: Tesseract finds all the letters.

Why?

Tesseract does not recognize single characters

2 Answers

You must set the "page segmentation mode" to "single char".

For example, in Android you do the following:

api.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_CHAR);
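
For reference, the single-character mode corresponds to the value 10 of Tesseract's --psm option. The default mode (3) runs automatic page layout analysis, which likely explains why a lone glyph is missed while a row of letters is found. Below is a minimal sketch in Python listing a few modes from the Tesseract documentation; the helper name psm_for and the content labels ("page", "char", and so on) are hypothetical and only serve as illustration.

```python
# A few of Tesseract's page segmentation modes (values as used by --psm).
PSM_MODES = {
    3: "Fully automatic page segmentation, but no OSD (default)",
    6: "Assume a single uniform block of text",
    7: "Treat the image as a single text line",
    8: "Treat the image as a single word",
    10: "Treat the image as a single character",
}

# Hypothetical mapping from what the image contains to a suitable mode.
_CONTENT_TO_PSM = {"page": 3, "block": 6, "line": 7, "word": 8, "char": 10}


def psm_for(content):
    """Return the --psm value for a content label such as 'char' or 'line'."""
    try:
        return _CONTENT_TO_PSM[content]
    except KeyError:
        raise ValueError(f"unknown content type: {content!r}") from None


print(psm_for("char"), "-", PSM_MODES[psm_for("char")])
```

Choosing the mode from what the image is expected to contain, rather than relying on the default, is the point of the answer above.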

answered Oct 21 '22 04:10

Marco Bonifazi

In Python, the same configuration looks like this:

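import cv2
import pytesseract

img = cv2.imread("path to some image")
# "osd" is the orientation and script detection data, not a recognition
# language, so a recognition language such as "eng" is used here instead.
text = pytesseract.image_to_string(
    img,
    config=("-c tessedit_char_whitelist="
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
            " --psm 10"
            " -l eng"))
print(text)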

The --psm flag defines the page segmentation mode.

According to the Tesseract documentation, 10 means:

"Treat the image as a single character."

So to recognize a single character, you just need to pass the --psm 10 flag.
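
The configuration string can also be assembled by a small helper, which makes it easier to compare the default mode against --psm 10 on the same image. This is only a sketch: build_config, recognize_single_char and ALPHANUMERIC are hypothetical names, and only pytesseract.image_to_string and the --psm and -c tessedit_char_whitelist options come from the libraries themselves.

```python
import string

# Hypothetical default whitelist: ASCII letters and digits.
ALPHANUMERIC = string.ascii_letters + string.digits


def build_config(psm=10, whitelist=ALPHANUMERIC):
    """Return a Tesseract config string for the given page segmentation mode."""
    parts = [f"--psm {psm}"]
    if whitelist:
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def recognize_single_char(image, lang="eng"):
    """OCR an image that is expected to contain exactly one character."""
    import pytesseract  # imported lazily so build_config works without it

    text = pytesseract.image_to_string(
        image, lang=lang, config=build_config(psm=10)
    )
    return text.strip()


print(build_config(psm=10, whitelist="ABC"))
```

Calling recognize_single_char(img) on the one-letter image from the question, and then image_to_string with build_config(psm=3) on the same image, should show whether the segmentation mode is what makes the difference.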

answered Oct 21 '22 04:10

Shahryar Saljoughi

Tags: ocr, tesseract

artem
