
Emgu.cv's Tesseract object using incorrect path for OCR files

I wish to use EMGU.CV's Tesseract object to do OCR on some pictures. To start, I've downloaded, compiled and run their OCR and LicensePlateRecognition examples.

However, Tesseract kept throwing the following exception:

Unable to create ocr model using Path 'teseract' and language 'eng'.

And I traced the source to the line:

_ocr = new Tesseract(@"tessdata", "eng", Tesseract.OcrEngineMode.OEM_TESSERACT_CUBE_COMBINED);

I tried the obvious fixes: I passed it the full path, I copied the files to just 'C:\', and I made sure that my program's current directory was the one containing tessdata.

None of those worked, so I used procmon and discovered it was looking for the files here:

C:\Program Files (x86)\Tesseract-OCR\tessdata

No matter what I do, I cannot make it look anywhere else. (Moving the files there worked, of course.) This path does not appear anywhere in EMGU.CV's code, so my guess is that it is compiled into Tesseract itself as some kind of default.

So, how do I stop Tesseract from using this location? I would expect the Tesseract constructor to do something with the path I pass into it, so what am I missing?

DanTheMan asked Feb 29 '12 18:02


3 Answers

I tried copying the files to the directory where my application runs, I tried absolute and relative paths, and I tried the hard-coded C:\Program Files (x86)\Tesseract-OCR\tessdata. None of them worked for me.

I got it working by doing the following:

  1. Copy the tessdata folder to the directory my app runs from.
  2. Pass an empty dataPath parameter (apparently tessdata/ is appended to dataPath by default). This code worked:

_ocr = new Tesseract("", "eng", Tesseract.OcrEngineMode.OEM_TESSERACT_CUBE_COMBINED);
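
The two steps above can be backed by a small pre-flight check, so that a missing file is reported with the exact path that was expected. This is only a sketch under stated assumptions: `TessdataLocator` and `FindMissingLanguageFile` are hypothetical names and not part of Emgu CV, the file name `<language>.traineddata` follows the Tesseract 3.x convention, and the cube engine mode may need further `eng.cube.*` files that this check does not look for.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of Emgu CV or Tesseract.
public static class TessdataLocator
{
    // Returns null when <baseDir>/tessdata/<language>.traineddata exists,
    // otherwise the full path that was expected (useful in an error message).
    public static string FindMissingLanguageFile(string baseDir, string language)
    {
        string expected = Path.Combine(baseDir, "tessdata", language + ".traineddata");
        return File.Exists(expected) ? null : expected;
    }
}

// Usage, assuming tessdata was copied next to the executable:
//   string missing = TessdataLocator.FindMissingLanguageFile(
//       AppDomain.CurrentDomain.BaseDirectory, "eng");
//   if (missing != null)
//       throw new FileNotFoundException("Language data not found", missing);
//   _ocr = new Tesseract("", "eng", Tesseract.OcrEngineMode.OEM_TESSERACT_CUBE_COMBINED);
```

Whether an empty dataPath is resolved against the executable's directory or the current working directory is not stated in the answer, so the base directory is left as a parameter here.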

Dan Gøran Lunde answered Nov 19 '22 04:11


The first parameter is the file location. The tip-off should have been the "@" sign, which makes the string a verbatim literal so that "\" is not treated as an escape character. It is typically used for paths to avoid having to write "\\" for every backslash.
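
To illustrate the "@" prefix: in C#, a verbatim string literal takes backslashes as-is, so the two spellings below produce the same string. `PathLiterals` is a hypothetical name and the path is only a placeholder; the prefix changes how the compiler reads the literal, not how the receiving code interprets the path.

```csharp
using System;

// Hypothetical illustration of regular vs. verbatim string literals.
public static class PathLiterals
{
    // Regular literal: each backslash must be escaped as "\\".
    public const string Escaped = "C:\\ocr\\tessdata";

    // Verbatim literal: backslashes are taken literally.
    public const string Verbatim = @"C:\ocr\tessdata";

    // True when both literals compile to the same character sequence.
    public static bool AreEqual()
    {
        return string.Equals(Escaped, Verbatim, StringComparison.Ordinal);
    }
}
```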

William T Finnegan answered Nov 19 '22 06:11


Check whether the TESSDATA_PREFIX environment variable is set; if it is, delete it and restart the application. I had this exact same problem...
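
A quick way to run that check from inside the application is to read the variable before constructing the OCR object. This is a minimal diagnostic sketch; `TessdataPrefixCheck` and `Describe` are hypothetical names, and it only reports the value seen by the current process rather than changing anything.

```csharp
using System;

// Hypothetical diagnostic helper; not part of Emgu CV or Tesseract.
public static class TessdataPrefixCheck
{
    // Describes the TESSDATA_PREFIX setting as seen by the current process.
    public static string Describe()
    {
        string value = Environment.GetEnvironmentVariable("TESSDATA_PREFIX");
        return string.IsNullOrEmpty(value)
            ? "TESSDATA_PREFIX is not set"
            : "TESSDATA_PREFIX = " + value;
    }
}

// Usage: Console.WriteLine(TessdataPrefixCheck.Describe());
```

If this reports a folder under C:\Program Files (x86)\Tesseract-OCR, that would be consistent with the path seen in procmon in the question, although the question itself does not confirm that the variable was set.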

Sup3rHugh answered Nov 19 '22 05:11