Is it possible to get the font of the characters recognized by Tesseract-OCR (for example, whether they are Arial or Times New Roman), either from the command line or through the API? I'm scanning documents in which different parts may use different fonts, and it would be useful to have this information.

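Tesseract's API provides a <code>WordFontAttributes</code> function, available on the <code>ResultIterator</code> class (it is inherited from <code>LTRResultIterator</code>), which reports the font name and style attributes of the current word.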
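As a rough illustration, here is a minimal Python sketch of how that call might be used through the third-party tesserocr binding. It is not part of the original answer. The helper names `describe_font` and `word_fonts` are made up for this example, and the dictionary keys are the ones tesserocr is documented to return. It also assumes the legacy engine is requested, since to my knowledge the LSTM engine in Tesseract 4 and later may leave the font attributes empty, and that the installed traineddata still contains the legacy model.

```python
from typing import Iterator, Optional, Tuple


def describe_font(attrs: Optional[dict]) -> str:
    """Format a font-attribute dict as a short label (hypothetical helper)."""
    # Some engines or words may yield no font information at all.
    if not attrs or not attrs.get("font_name"):
        return "unknown"
    parts = [attrs["font_name"]]
    # Append each style flag that is set for the word.
    for flag in ("bold", "italic", "underlined", "monospace", "serif", "smallcaps"):
        if attrs.get(flag):
            parts.append(flag)
    if attrs.get("pointsize"):
        parts.append(f"{attrs['pointsize']}pt")
    return " ".join(parts)


def word_fonts(image_path: str, lang: str = "eng") -> Iterator[Tuple[str, str]]:
    """Yield (word, font description) pairs for each recognized word."""
    # Imported lazily so describe_font works without tesserocr installed.
    from tesserocr import OEM, RIL, PyTessBaseAPI, iterate_level

    # Assumption: the legacy engine is selected because the LSTM engine
    # may not populate font attributes.
    with PyTessBaseAPI(lang=lang, oem=OEM.TESSERACT_ONLY) as api:
        api.SetImageFile(image_path)
        api.Recognize()
        iterator = api.GetIterator()
        if iterator is None:
            return
        for item in iterate_level(iterator, RIL.WORD):
            word = item.GetUTF8Text(RIL.WORD)
            yield word, describe_font(item.WordFontAttributes())
```

Calling `list(word_fonts("scan.png"))` on a scanned page (the file name is a placeholder) should then give one pair per word, so runs of words sharing a label can be grouped into same-font regions. The reported name is the closest match among the fonts the language data was trained on, so it is better treated as a hint than as a definitive identification.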

Get font of recognized character with Tesseract-OCR



answered Sep 29 '22 10:09

nguyenq


Tags:

tesseract

sashoalm
