I want to use EMGU.CV's Tesseract object to do OCR on some pictures. To start, I've downloaded, compiled and run their OCR and LicensePlateRecognition examples.
However, Tesseract kept throwing the following exception:
Unable to create ocr model using Path 'teseract' and language 'eng'.
And I traced the source to the line:
_ocr = new Tesseract(@"tessdata", "eng", Tesseract.OcrEngineMode.OEM_TESSERACT_CUBE_COMBINED);
I tried fixing it in the most obvious ways: I gave it the full path, I copied the files to just 'C:\', and I made sure that my program's current directory was the one containing tessdata.
None of those worked, so I used procmon and discovered it was looking for the files here:
C:\Program Files (x86)\Tesseract-OCR\tessdata
And it seems that no matter what I do, I cannot make it look anywhere else. (Moving the files there worked, of course.) This location does not appear anywhere in EMGU.CV's code, so my guess is that it's compiled into Tesseract's code as some default (?).
So, how do I stop Tesseract from using this location? The obvious expectation is that the Tesseract constructor should DO something with the path I pass into it, so what am I missing?
I have tried copying files to the directory where my application runs, I have tried absolute and relative paths, and I have tried using the hard-coded C:\Program Files (x86)\Tesseract-OCR\tessdata. None of them worked for me.
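For anyone repeating these checks, here is a minimal sketch that prints the current directory and tests whether a candidate tessdata folder actually contains the language file. It assumes the file is named eng.traineddata, which is the usual Tesseract naming. TessdataLocator and HasLanguageFile are hypothetical names made up for illustration and are not part of EMGU.CV. It only rules out a missing file; it does not show which path Tesseract itself ends up reading.

```csharp
using System;
using System.IO;

public static class TessdataLocator
{
    // Hypothetical helper: true when <tessdataDir>\<language>.traineddata exists.
    public static bool HasLanguageFile(string tessdataDir, string language)
    {
        string file = Path.Combine(tessdataDir, language + ".traineddata");
        return File.Exists(file);
    }

    public static void Main()
    {
        // A relative path such as "tessdata" is resolved against this directory.
        Console.WriteLine("Current directory: " + Directory.GetCurrentDirectory());

        string candidate = Path.GetFullPath("tessdata");
        Console.WriteLine(candidate + " has eng.traineddata: " + HasLanguageFile(candidate, "eng"));
    }
}
```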
I got it working by doing the following:
_ocr = new Tesseract("", "eng", Tesseract.OcrEngineMode.OEM_TESSERACT_CUBE_COMBINED);
The first parameter is the data path. The tip-off should have been the "@" prefix, which makes the string a verbatim literal so that "\" is not treated as an escape character. It is typically used for paths so that the backslashes do not have to be doubled.
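As a small illustration of what the "@" prefix does in C#, the sketch below compares a regular string literal with a verbatim one. VerbatimStringDemo is a made-up class name; nothing here touches EMGU.CV or Tesseract.

```csharp
using System;

public static class VerbatimStringDemo
{
    // Regular literal: every backslash has to be escaped as "\\".
    public static string Escaped() => "C:\\Program Files (x86)\\Tesseract-OCR\\tessdata";

    // Verbatim literal: "@" turns off escape processing, so "\" is taken literally.
    public static string Verbatim() => @"C:\Program Files (x86)\Tesseract-OCR\tessdata";

    public static void Main()
    {
        // Both literals produce the same string.
        Console.WriteLine(Escaped() == Verbatim());   // True

        // Without any backslash in the text, "@" changes nothing.
        Console.WriteLine(@"tessdata" == "tessdata"); // True
    }
}
```

So @"tessdata" and "tessdata" are the same value, and the prefix only matters once the path contains backslashes.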
Check whether the TESSDATA_PREFIX environment variable is set (if it is, delete it and restart the application).
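A quick way to check this from code is sketched below. It only reads the variable at the process, user and machine level using the standard System.Environment API. TessdataPrefixCheck is a hypothetical name. Removing the variable is still best done in the system settings followed by a restart, as described above, since a change made inside a running process may not be seen by the native library.

```csharp
using System;

public static class TessdataPrefixCheck
{
    public const string VariableName = "TESSDATA_PREFIX";

    // Returns the value at the given level, or null when it is not set there.
    public static string Read(EnvironmentVariableTarget target)
    {
        return Environment.GetEnvironmentVariable(VariableName, target);
    }

    public static void Main()
    {
        foreach (EnvironmentVariableTarget target in
                 new[] { EnvironmentVariableTarget.Process,
                         EnvironmentVariableTarget.User,
                         EnvironmentVariableTarget.Machine })
        {
            string value = Read(target);
            Console.WriteLine(target + ": " + (value ?? "(not set)"));
        }
    }
}
```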
I had this exact same problem...