pytesseract using tesseract 4.0 numbers only not working

3 Answers

You can restrict recognition to digits by passing tessedit_char_whitelist as a config option, as shown below.

import pytesseract
ocr_result = pytesseract.image_to_string(
    image, lang='eng',
    config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
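As a rough sketch of the same call, the config string can be built by a small helper so that the whitelist and page segmentation mode are easy to vary. The names build_digits_config and ocr_digits are hypothetical and not part of pytesseract; pytesseract is imported lazily so the string-building part can be checked without Tesseract installed.

```python
def build_digits_config(psm=10, oem=3, whitelist="0123456789"):
    """Return a Tesseract config string restricting output to `whitelist`."""
    return f"--psm {psm} --oem {oem} -c tessedit_char_whitelist={whitelist}"


def ocr_digits(image, psm=10):
    """Hypothetical wrapper: OCR a PIL image or NumPy array, digits only."""
    # Lazy import: requires the pytesseract package and the tesseract binary.
    import pytesseract
    return pytesseract.image_to_string(
        image, lang="eng", config=build_digits_config(psm=psm))


# Only the config string is built here; no OCR is run.
print(build_digits_config(psm=7))
# prints: --psm 7 --oem 3 -c tessedit_char_whitelist=0123456789
```

Note that --psm 10 treats the image as a single character, while --psm 7 treats it as a single line of text, which may suit multi-digit numbers better.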

Hope this helps.

126

answered Sep 28 '22 10:09

thewaywewere

Passing tessedit_char_whitelist through pytesseract did not work for me. One workaround is to use the bundled digits config instead, via config='digits':

import pytesseract
text = pytesseract.image_to_string(pixels, config='digits')

where pixels is a NumPy array of your image (a PIL image should also work). This should force pytesseract to return only digits. To customize what it returns, find your digits configuration file; on Windows mine was located here:

C:\Program Files (x86)\Tesseract-OCR\tessdata\configs

Open the digits file and add whatever characters you want. After saving and running pytesseract, it should return only those customized characters.
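If the whitelist still appears to be ignored (as far as I know, the LSTM engine in Tesseract 4.0 did not honor tessedit_char_whitelist, and support returned in 4.1), one fallback is to filter the recognized text afterwards. This is a different technique from the digits config above: it only drops unwanted characters and does not make recognition itself more accurate. The function name keep_allowed is hypothetical.

```python
import re


def keep_allowed(text, allowed="0123456789"):
    """Remove every character from `text` that is not in `allowed`."""
    # re.escape keeps characters such as '-' or '.' literal inside the class.
    pattern = f"[^{re.escape(allowed)}]"
    return re.sub(pattern, "", text)


# A stand-in string is used here instead of real OCR output.
print(keep_allowed("Total: 1,234\n"))
# prints: 1234
```

In practice the input would be the string returned by pytesseract.image_to_string. A digit misread as a letter (for example 0 read as O) is simply dropped by this filter, so the result may be shorter than the number in the image.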

answered Sep 28 '22 12:09

Robert Harris

You can restrict recognition to digits by passing tessedit_char_whitelist as a config option, as shown below.

ocr_result = pytesseract.image_to_string(image, lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')

answered Sep 28 '22 11:09

Tejesh Teju

Tags:

python

tesseract

CuriousGeorge
