Tesseract running error

Tags: ocr, tesseract

I have a problem running the tesseract-ocr engine on Linux. I downloaded the RUS language data and put it in the tessdata directory (/usr/local/share/tessdata). When I run tesseract with the command tesseract blob.jpg out -l rus, it displays an error:

Error opening data file /usr/local/share/tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language eng
Tesseract couldn't load any languages!
Could not initialize tesseract.

According to the compiling guide, I used export TESSDATA_PREFIX='/usr/local/share/' to point to my tessdata directory. Do I need to edit any config files? Tesseract tries to load the 'eng' data files instead of 'rus'.

Screenshot: http://i.stack.imgur.com/I0Guc.png

366

asked Feb 10 '13 17:02

Russel Crowe

1 Answer

You can grab eng.traineddata from GitHub:

wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata

Check https://github.com/tesseract-ocr/tessdata for a full list of trained language data.
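If you need more than one language (for example eng plus rus, as in the question), something like the sketch below may save some typing. The tessdata_url helper is hypothetical, and it assumes every language file follows the same URL pattern as eng.traineddata above, which is worth checking against the repository listing first. It only prints the wget commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: build the download URL for one language code.
# Assumption: other languages use the same URL pattern as eng.traineddata.
tessdata_url() {
    echo "https://github.com/tesseract-ocr/tessdata/raw/master/$1.traineddata"
}

# Print the commands instead of running them, so nothing is downloaded by accident.
for lang in eng rus; do
    echo "wget $(tessdata_url "$lang")"
done
```

Copy the printed lines into your shell once the URLs look right.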

Once you have the file(s), move them to the /usr/local/share/tessdata folder. Warning: some Linux distributions (such as openSUSE and Ubuntu) may expect them in /usr/share/tessdata instead.

# If you got the data from Google, unzip it first!
gunzip eng.traineddata.gz

# Move the data
sudo mv -v eng.traineddata /usr/local/share/tessdata/
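To confirm the files ended up where Tesseract looks for them, a quick check along these lines might help. This is only a sketch: check_tessdata is a made-up helper name, and the two directories are just the ones mentioned above, so adjust them if your install uses a different prefix.

```shell
#!/bin/sh
# Hypothetical helper: report which language files exist in a tessdata directory.
# Usage: check_tessdata DIR LANG...
check_tessdata() {
    dir=$1
    shift
    for lang in "$@"; do
        if [ -f "$dir/$lang.traineddata" ]; then
            echo "found: $dir/$lang.traineddata"
        else
            echo "missing: $dir/$lang.traineddata"
        fi
    done
}

# Check both locations mentioned above; directories that do not exist are skipped.
for d in /usr/local/share/tessdata /usr/share/tessdata; do
    if [ -d "$d" ]; then
        check_tessdata "$d" eng rus
    fi
done
```

If eng.traineddata shows up as missing in the directory named in the error message, that matches the failure in the question.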

188

answered Oct 08 '22 04:10

AAAfarmclub

