How to set tessedit_write_images in python-tesseract?

Tags: tesseract, python-tesseract

I'm trying to set tessedit_write_images, but it doesn't seem to take effect: I can't find tessinput.tif anywhere.

I'm doing:

import tesseract

api = tesseract.TessBaseAPI()
api.Init(".","eng",tesseract.OEM_TESSERACT_ONLY)
api.SetPageSegMode(tesseract.PSM_AUTO_OSD)
api.SetVariable("tessedit_write_images", "T")

I've also tried "True", "1", and a few other variations, but none of them seem to work.

Any help?

asked Jul 22 '15 10:07

tiagosilva

1 Answer

tessedit_write_images is checked in only one place in Tesseract's source code: TessBaseAPI::ProcessPage(). Setting the variable has no visible effect unless that function runs.

So you have two options:

1. Call api.GetThresholdedImage() yourself. The image it returns is the same one that would be saved if you set the variable and called ProcessPage().
2. Call api.ProcessPage(). It checks the variable and writes the .tif.
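
The second option can be sketched as a small helper that sets the variable, runs one page, and then checks whether the file actually appeared. This is only a sketch under a few assumptions: the name run_with_debug_image and the run_page callable are hypothetical, the file name tessinput.tif is taken from the question, and the file is assumed to land in the current working directory because Tesseract writes it with a relative path.

```python
import os

# File name from the question; assumed to be unchanged in your Tesseract version.
DEBUG_IMAGE = "tessinput.tif"


def run_with_debug_image(api, run_page):
    """Set tessedit_write_images, process one page, and locate the output.

    api      -- an initialised TessBaseAPI-like object exposing SetVariable()
    run_page -- hypothetical callable taking the api; it must use whatever
                entry point of your binding ends up in ProcessPage()
    Returns the absolute path of tessinput.tif, or None if it was not written.
    """
    path = os.path.join(os.getcwd(), DEBUG_IMAGE)
    # Remove a stale copy so a leftover file is not mistaken for a fresh one.
    if os.path.isfile(path):
        os.remove(path)
    # Same call as in the question; the variable is only read later,
    # while the page is being processed.
    api.SetVariable("tessedit_write_images", "T")
    run_page(api)
    return path if os.path.isfile(path) else None
```

With python-tesseract, run_page could be something like lambda a: tesseract.ProcessPagesWrapper("page.png", a), assuming your build exposes that helper and that it goes through ProcessPage(). If the helper returns None, the call you used most likely never reached ProcessPage().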

answered Sep 21 '22 14:09

cortex42
