
Python: Install Tesseract for Windows 7

Tags: python, ocr

My objective is to use OCR in Python 2.7 with Tesseract on a Windows 7 machine, but I am running into issues with the installation process. I tried following the instructions here, but the links to "tesseract-core-yyyymmdd.exe" and "tesseract-langs-yyyymmdd.exe" no longer exist and I can't find these .exe files elsewhere online. Here's what I have done so far:

  1. Installed Tesseract using the executable from the official tesseract-ocr page.
  2. Installed the packages "wand", "PIL", and "pyocr" via pip.

Now, if I do the following in Python:

from wand.image import Image
from PIL import Image as PI
import pyocr
import pyocr.builders
import io

These packages load without any problem, but pyocr.get_available_tools() gives me an empty list. I suspect this has to do with the missing installation .exe files mentioned above. Where can I find them? Or is there something else that I am missing?

Plug4 asked Mar 16 '17 10:03



2 Answers

I just set up pytesseract and it works! I have Windows 10 and Python 2.7 installed.

All you need to do:

  1. Download the Microsoft Visual C++ Compiler for Python 2.7 from http://aka.ms/vcpython27 and install it (a common installation step).
  2. Download pytesseract from PyPI via this link: https://pypi.python.org/pypi/pytesseract
  3. Unzip the file.
  4. Go to the directory that contains the unzipped files.
  5. Run this command: "python setup.py install"
  6. (Optional) To test whether it is installed, open your Python shell and run this command: "import pytesseract"

I hope it works! Note that pytesseract is a Python wrapper for Google's Tesseract OCR engine rather than an OCR engine itself; it calls the tesseract executable, so Tesseract must also be installed.
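A successful "import pytesseract" only confirms that the Python wrapper is installed; the wrapper still has to find the tesseract executable when it runs. As a rough sketch (the helper name find_tesseract and its parameters are made up for illustration, not part of pytesseract or pyocr), the following checks whether the executable can be found on PATH or in extra directories you pass in. It uses only the standard library and should run on Python 2.7 as well as Python 3:

```python
import os


def find_tesseract(extra_dirs=(), path_env=None):
    """Return the full path to the tesseract executable, or None.

    extra_dirs: hypothetical extra folders to search first, for example
                the folder chosen in the Tesseract installer.
    path_env:   PATH-style string to search; defaults to the real PATH.
    """
    if path_env is None:
        path_env = os.environ.get("PATH", "")
    # Search the extra folders first, then every non-empty PATH entry.
    dirs = list(extra_dirs) + [d for d in path_env.split(os.pathsep) if d]
    for directory in dirs:
        # Windows installs use tesseract.exe; other systems use tesseract.
        for name in ("tesseract.exe", "tesseract"):
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                return candidate
    return None


found = find_tesseract()
print(found or "tesseract executable not found on PATH")
```

If this prints the "not found" message, adding the Tesseract install folder to the PATH environment variable (and reopening the shell) is a likely fix. A missing PATH entry is also a plausible reason for pyocr.get_available_tools() returning an empty list, although that is an assumption about the asker's setup.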

Asha Magenta answered Sep 21 '22 22:09


Step [1] To install Tesseract, visit

https://github.com/UB-Mannheim/tesseract/wiki

The latest installers can be downloaded from here: e.g., tesseract-ocr-setup-3.05.02-20180621.exe, tesseract-ocr-w32-setup-v4.0.0-beta.1.20180608.exe, tesseract-ocr-w64-setup-v4.0.0-beta.1.20180608.exe (64 bit)

Step [2] Download and install the Microsoft Visual C++ Compiler for Python 2.7 from the link below: https://download.microsoft.com/download/7/9/6/796EF2E4-801B-4FC4-AB28-B59FBF6D907B/VCForPython27.msi

Step [3] Install pytesseract, the Python binding for Tesseract, using pip:

pip install pytesseract

Step [4] You can also install an image processing library for Python, e.g., Pillow:

pip install pillow

Congratulations, you are done! :)
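Once the steps above are complete, a quick end-to-end check might look like the sketch below. The install path in TESSERACT_EXE is an assumption about where the installer put the program (adjust it to your machine), and the helper name ocr_file is made up for illustration. pytesseract.pytesseract.tesseract_cmd and pytesseract.image_to_string are the wrapper's own names:

```python
import os

try:
    import pytesseract
    from PIL import Image
except ImportError:
    # Lets the script load even if Step [3] or Step [4] was skipped.
    pytesseract = None
    Image = None

# Assumed install location; change this to match your own installation.
TESSERACT_EXE = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def ocr_file(image_path, tesseract_exe=TESSERACT_EXE):
    """Return the text that Tesseract recognises in the image at image_path."""
    if pytesseract is None:
        raise RuntimeError("install pytesseract and pillow first")
    # Point the wrapper at the executable explicitly if it exists there;
    # otherwise pytesseract falls back to looking for "tesseract" on PATH.
    if os.path.isfile(tesseract_exe):
        pytesseract.pytesseract.tesseract_cmd = tesseract_exe
    return pytesseract.image_to_string(Image.open(image_path))
```

For example, print(ocr_file("sample.png")) should print the recognised text, where sample.png is a placeholder for any image containing text.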

Shashank Singh answered Sep 17 '22 22:09