Image to text python

Tags: python, python-3.x, pytesser

I am using Python 3.x and the following code to convert an image into text:

from PIL import Image
from pytesseract import image_to_string

image = Image.open('image.png', mode='r')
print(image_to_string(image))

I am getting the following error:

Traceback (most recent call last):
  File "C:/Users/hp/Desktop/GII/Image_to_text.py", line 12, in <module>
    print(image_to_string(image))
  File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\site-packages\pytesseract\pytesseract.py", line 161, in image_to_string
    config=config)
  File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\site-packages\pytesseract\pytesseract.py", line 94, in run_tesseract
    stderr=subprocess.PIPE)
  File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\subprocess.py", line 950, in __init__
    restore_signals, start_new_session)
  File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\subprocess.py", line 1220, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

Please note that I have put the image in the same directory as my Python script. Also, the error is not raised on image = Image.open('image.png', mode='r') but on the line print(image_to_string(image)).

Any idea what might be wrong here? Thanks

asked Jul 21 '16 14:07

muazfaiz

1 Answer

You have to have tesseract installed and accessible on your PATH.

According to the source, pytesseract is merely a wrapper around subprocess.Popen that runs the tesseract binary. It does not perform any OCR itself.
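A quick way to confirm whether the binary is reachable is to ask Python where it would find it. This is a minimal sketch using only the standard library; the helper name `find_tesseract` is made up for illustration and is not part of pytesseract.

```python
import shutil


def find_tesseract(name="tesseract"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract was not found on PATH; install it or add its folder to PATH")
    else:
        print("tesseract found at", path)
```

If this reports that nothing was found, the FileNotFoundError in the question is the expected outcome, since pytesseract tries to launch the same executable name.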

Relevant part of the source:

def run_tesseract(input_filename, output_filename_base, lang=None, boxes=False, config=None):
    '''
    runs the command:
        `tesseract_cmd` `input_filename` `output_filename_base`

    returns the exit status of tesseract, as well as tesseract's stderr output
    '''
    command = [tesseract_cmd, input_filename, output_filename_base]

    if lang is not None:
        command += ['-l', lang]

    if boxes:
        command += ['batch.nochop', 'makebox']

    if config:
        command += shlex.split(config)

    proc = subprocess.Popen(command,
            stderr=subprocess.PIPE)
    return (proc.wait(), proc.stderr.read())
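To see why a missing binary surfaces as FileNotFoundError rather than as an OCR error, it may help to reproduce the failure without pytesseract at all. The sketch below is an illustration only: `can_launch` is a hypothetical helper, and the program name is chosen on the assumption that no such executable exists on your machine.

```python
import subprocess


def can_launch(command):
    """Try to start `command`; return False if the executable cannot be found."""
    try:
        proc = subprocess.Popen(command,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    except FileNotFoundError:
        # Same exception type as in the question's traceback: the OS could
        # not locate the program named in command[0].
        return False
    proc.communicate()  # wait for the process to finish and drain its pipes
    return True


# Assumes no program with this name is installed.
print(can_launch(["no-such-program-hopefully", "--version"]))
```

The exception is raised while the process is being created, before tesseract ever sees the image, which is why the image path is not the problem here.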

Quoting another part of source:

# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
tesseract_cmd = 'tesseract'

So a quick way to change the tesseract path would be:

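import pytesseract
from PIL import Image

# Set this once; the variable lives in the pytesseract.pytesseract module
pytesseract.pytesseract.tesseract_cmd = "/absolute/path/to/tesseract"
print(pytesseract.image_to_string(Image.open('image.png')))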
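If you would rather not hard-code a single path, one option is to look on PATH first and fall back to a list of likely locations. The following is a sketch under stated assumptions: `resolve_executable` is a hypothetical helper, and the Windows path in `GUESSES` is only a guess at an install location, not something taken from the question.

```python
import os
import shutil


def resolve_executable(name, candidates=()):
    """Return a usable path for `name`: PATH first, then the candidate files."""
    found = shutil.which(name)
    if found:
        return found
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


# Hypothetical install location; adjust to wherever tesseract lives on your machine.
GUESSES = [r"C:\Program Files\Tesseract-OCR\tesseract.exe"]

tesseract_path = resolve_executable("tesseract", GUESSES)
print(tesseract_path or "tesseract not found; pass its absolute path explicitly")
```

When a path is returned, it can be assigned to pytesseract.pytesseract.tesseract_cmd once, before the first call to image_to_string.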

answered Sep 27 '22 17:09

Łukasz Rogalski

