I think this issue is specific to Tesseract 4, which comes with LSTM support. As I am using a 64-bit Windows system, I downloaded the 64-bit Windows executable from here: https://github.com/UB-Mannheim/tesseract/wiki
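It has the following OCR Engine modes (--oem):
0 Legacy engine only.
1 Neural nets LSTM engine only.
2 Legacy + LSTM engines.
3 Default, based on what is available.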
It works with all the modes except 2.
tesseract --oem 1 1.jpg 1
Result:
Tesseract Open Source OCR Engine v4.0.0.20190314 with Leptonica
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 561
Detected 5 diacritics
and creates a file 1.txt with the corresponding OCR result.
tesseract --oem 2 1.jpg 1
Result:
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.
and no output is generated.
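To compare the modes side by side, the two runs above can be scripted. The following is only a minimal sketch in Python, assuming tesseract is on the PATH; the helper names build_command and probe_oem_modes are hypothetical and not part of Tesseract. It treats a non-zero exit code as failure, which is an assumption about how the "Could not initialize tesseract." case is reported.

```python
import subprocess


def build_command(oem, image, out_base, exe="tesseract"):
    """Return the argument list for one run, e.g. tesseract --oem 1 1.jpg 1."""
    return [exe, "--oem", str(oem), image, out_base]


def probe_oem_modes(image, out_base, modes=(0, 1, 2, 3), exe="tesseract"):
    """Run tesseract once per engine mode and collect the outcome.

    Returns a dict mapping each mode to (exit_code, captured_messages).
    Each mode writes to its own output base so results are not overwritten.
    """
    results = {}
    for oem in modes:
        cmd = build_command(oem, image, f"{out_base}_oem{oem}", exe)
        proc = subprocess.run(cmd, capture_output=True, text=True)
        # Keep both streams; the banner and errors may land on either one.
        messages = (proc.stdout + proc.stderr).strip()
        results[oem] = (proc.returncode, messages)
    return results
```

Calling probe_oem_modes("1.jpg", "1") and printing the returned dict should show which modes initialize and which report "Failed loading language 'eng'".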
I thought the error might be with the language installation, so I ran
tesseract --list-langs
which gave me the following result:
List of available languages (2):
eng
osd
I even checked the tessdata folder manually, and it clearly shows that I already have the eng language data.
Can anyone help me find the exact problem that is preventing me from using the Legacy + LSTM engines (--oem 2) mode?
Yes, you have the eng language, but with LSTM support only. If you want both LSTM and Legacy support, you need to download the language data from the tessdata repository (https://github.com/tesseract-ocr/tessdata).
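As a rough illustration of that fix, the sketch below downloads eng.traineddata from the tessdata repository into a separate folder and builds a command that points Tesseract at it with --tessdata-dir. It is a sketch under stated assumptions, not a definitive procedure: the raw-download URL pattern and the "main" branch name are assumptions, the folder name used in the usage note is arbitrary, and the helper names are hypothetical.

```python
import os
import urllib.request

# Assumption: raw files are served from the "main" branch at this path.
TESSDATA_RAW = "https://github.com/tesseract-ocr/tessdata/raw/main"


def traineddata_url(lang, base=TESSDATA_RAW):
    """Build the assumed download URL for one language file."""
    return f"{base}/{lang}.traineddata"


def fetch_traineddata(lang, dest_dir):
    """Download <lang>.traineddata into dest_dir unless it is already there."""
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.join(dest_dir, f"{lang}.traineddata")
    if not os.path.exists(target):
        urllib.request.urlretrieve(traineddata_url(lang), target)
    return target


def combined_engine_command(image, out_base, tessdata_dir, lang="eng",
                            exe="tesseract"):
    """Argument list for a Legacy + LSTM run against a custom tessdata folder."""
    return [exe, "--tessdata-dir", tessdata_dir, "--oem", "2",
            "-l", lang, image, out_base]
```

For example, after fetch_traineddata("eng", "tessdata_full"), the list returned by combined_engine_command("1.jpg", "1", "tessdata_full") can be passed to subprocess.run. Keeping the download in its own folder leaves the installer's tessdata untouched; replacing the file in the installed tessdata folder would be the alternative.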