pytesseract cannot find the file specified

My code is straightforward:

import pytesseract
from PIL import Image

img = Image.open('C:/temp/foo.jpg')
img.load()
i = pytesseract.image_to_string(img)

The error I get back is:

Traceback (most recent call last):
  File "img.py", line 6, in <module>
    i = pytesseract.image_to_string(img)
  File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 161, in image_to_string
  File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 94, in run_tesseract
  File "C:\Users\%USER%\AppData\Local\Continuum\Anaconda\lib\subprocess.py", line 710, in __init__
    errread, errwrite)
  File "C:\Users\%USER%\AppData\Local\Continuum\Anaconda\lib\subprocess.py", line 958, in _execute_child
    startupinfo)
WindowsError: [Error 2] The system cannot find the file specified

Any guidance would be fantastic.

Adding the Tesseract directory to my PATH variable helped: C:\Program Files (x86)\Tesseract-OCR

But the code now crashes when it reaches the pytesseract call.

jason m asked Dec 11 '15 14:12



1 Answer

I just hit the same error and decided to answer this question; it might save someone some time.

First, make sure the Tesseract-OCR executables are actually installed (or copied) on your machine.

Windows can't find the tesseract executable in any of the directories listed in your PATH environment variable. So either make sure the directory containing tesseract is in your PATH, or override the tesseract_cmd variable in your Python script as follows (substitute your own path):

import pytesseract

pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
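If you are not sure which of the two options applies on a given machine, a small check along these lines may help. This is only a sketch, assuming Python 3.3+ (for shutil.which); the helper names resolve_executable and configure_pytesseract are made up for illustration and are not part of pytesseract. Only the tesseract_cmd attribute comes from the library.

```python
import os
import shutil


def resolve_executable(name, fallback_paths):
    """Return a usable path for `name`, or None if nothing is found.

    PATH is searched first, then each explicit fallback path in order.
    """
    found = shutil.which(name)  # searches the directories listed in PATH
    if found:
        return found
    for candidate in fallback_paths:
        if os.path.isfile(candidate):
            return candidate
    return None


def configure_pytesseract(fallback_paths):
    """Point pytesseract at whichever tesseract executable was found."""
    import pytesseract  # imported here so the lookup above works without it

    exe = resolve_executable("tesseract", fallback_paths)
    if exe is None:
        raise RuntimeError("tesseract executable not found; install it or fix PATH")
    pytesseract.pytesseract.tesseract_cmd = exe
    return exe
```

Calling configure_pytesseract([r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe']) once before image_to_string would then either use the copy on PATH or fall back to that location. The fallback path is the one from the question, so adjust it to your own installation.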

Besides that, make sure the TESSDATA_PREFIX Windows environment variable is set to the directory that contains the tessdata directory. For example:

TESSDATA_PREFIX=C:\Program Files (x86)\Tesseract-OCR

if the tessdata directory is located at: C:\Program Files (x86)\Tesseract-OCR\tessdata
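The variable can also be set from within the script, before the first pytesseract call, since processes started from Python inherit os.environ by default. A rough sketch follows; set_tessdata_prefix is a hypothetical helper, and it assumes the layout described above, where TESSDATA_PREFIX points at the parent of tessdata. Your Tesseract version may expect something different, so check its documentation.

```python
import os


def set_tessdata_prefix(install_dir):
    """Set TESSDATA_PREFIX to install_dir if it contains a tessdata folder.

    Returns True when the variable was set, False otherwise.
    """
    if not os.path.isdir(os.path.join(install_dir, "tessdata")):
        return False
    # Assumption: the variable should name the parent of tessdata.
    os.environ["TESSDATA_PREFIX"] = install_dir
    return True
```

For the installation in the question this would be set_tessdata_prefix(r'C:\Program Files (x86)\Tesseract-OCR'). A False result means the tessdata directory is not where you expected it to be.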

MaxU - stop WAR against UA answered Sep 19 '22 13:09