PyTesser simple usage error

Tags: python, ocr

I've downloaded PyTesser and extracted it.

From inside the pytesser_v0.0.1 folder, I tried to run the sample usage code in the Python interpreter:

from pytesser import *
print image_file_to_string('fnord.tif')

This is the output:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pytesser.py", line 44, in image_file_to_string
    call_tesseract(filename, scratch_text_name_root)
  File "pytesser.py", line 21, in call_tesseract
    proc = subprocess.Popen(args)
  File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1259, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

NOTE: I'm on Ubuntu 12.10 with Python 2.7.3.

Can anyone help me understand this error and tell me what I can do to fix it?

Ghilas BELHADJ asked Aug 19 '13 20:08


1 Answer

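This isn't as well documented as it could be, but if you are not on Windows you need to install the tesseract binary for your platform yourself. The OSError in your traceback comes from subprocess.Popen failing to find an executable to run, not from a missing image file. On Ubuntu and other Debian-based Linux distributions, install it with sudo apt-get install tesseract-ocr. Then you can run: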

python pytesser.py

which uses the test files phototest.tif, fnord.tif and fonts_test.png to test the library.
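If you want to confirm the cause before or after installing, a quick check along these lines may help. This is only a sketch: the helper name find_executable is made up for illustration and is not part of PyTesser, and it assumes PyTesser launches an executable named tesseract that is looked up on your PATH. It avoids Python 3-only calls, so it should run on Python 2.7 as well.

```python
import os


def find_executable(name, path=None):
    """Return the full path of `name` if an executable file with that
    name exists in one of the PATH directories, otherwise None.

    Hypothetical helper for illustration; not part of PyTesser.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# Assumption: the binary PyTesser needs is called "tesseract".
tesseract_path = find_executable("tesseract")
if tesseract_path is None:
    print("tesseract not found on PATH; try: sudo apt-get install tesseract-ocr")
else:
    print("tesseract found at " + tesseract_path)
```

If this reports that tesseract is missing, the [Errno 2] from subprocess.Popen is expected, and installing the package should resolve it.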

Paulo Almeida answered Sep 27 '22 17:09