
Tesseract installation in Google Colaboratory

I installed tesseract in Google Colab using the command:

!pip install tesseract

But when I run the command

text = pytesseract.image_to_string(Image.open('cropped_img.png'))

I get the below error:

TesseractNotFoundError: tesseract is not installed or it's not in your path

Asked by Prosenjit, Aug 05 '18 17:08



2 Answers

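Point pytesseract at the tesseract executable itself, not at the pytesseract script:

pytesseract.pytesseract.tesseract_cmd = r'/usr/bin/tesseract'

/usr/bin/tesseract is where apt normally places the binary on Ubuntu; run !which tesseract to confirm the path on your runtime.

This should solve the TesseractNotFoundError, provided the tesseract binary is actually installed.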
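If you would rather not hard-code the path, something like the following could look it up for you. This is only a sketch: locate_binary is a helper name made up for this example, and it assumes tesseract is already on the PATH of the runtime.

```python
import shutil


def locate_binary(name="tesseract"):
    """Return the absolute path of `name` on PATH, or None if it is missing.

    `locate_binary` is a hypothetical helper for this example, not part of pytesseract.
    """
    return shutil.which(name)


tesseract_path = locate_binary()

if tesseract_path is None:
    print("tesseract is not on PATH; install it first (e.g. !apt install tesseract-ocr)")
else:
    try:
        import pytesseract

        # Tell pytesseract which executable to invoke.
        pytesseract.pytesseract.tesseract_cmd = tesseract_path
        print("Using tesseract at", tesseract_path)
    except ImportError:
        print("pytesseract is not installed; run !pip install pytesseract")
```

If the first message is printed, setting tesseract_cmd will not help, because there is no binary to point to yet.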

Answered by Standerwahre, Sep 19 '22 20:09


There could be several reasons for this, but usually it is because the tesseract OCR engine itself is not installed. pytesseract is required, but it is only a Python wrapper that calls the tesseract executable, so it is only half of the solution.

You need to install both the tesseract package for Linux and the Python wrapper.

Install the engine first:

! apt install tesseract-ocr
! apt install libtesseract-dev

These commands install the tesseract engine that pytesseract depends on. The leading ! is important: it tells Colab to run the line as a shell command on the underlying operating system rather than as Python.
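To confirm that the apt install worked before going further, you could ask the binary for its version from Python. This is a rough sketch; tesseract_version is a name invented for this example, and it only assumes the executable is called tesseract.

```python
import subprocess


def tesseract_version(cmd="tesseract"):
    """Return the first line printed by `cmd --version`, or None if it cannot be run.

    `tesseract_version` is a hypothetical helper for this example.
    """
    try:
        result = subprocess.run(
            [cmd, "--version"], capture_output=True, text=True, check=False
        )
    except OSError:
        # Raised when the executable does not exist or cannot be started.
        return None
    # Depending on the release, the banner may go to stdout or stderr.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


version = tesseract_version()
print(version if version else "tesseract is not installed or not on PATH")
```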

Then install the Python packages:

! pip install Pillow
! pip install pytesseract

This installs the Python wrapper, along with Pillow for opening images.

After that, all you need to do is import:

import pytesseract
from PIL import ImageEnhance, ImageFilter, Image

Then you can call pytesseract.image_to_string as in the question and let the magic happen.
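Putting it together, a minimal sketch of the OCR call might look like this. ocr_image is a hypothetical wrapper written for this example, and cropped_img.png is simply the file name from the question; both packages are assumed to be installed as described above.

```python
def ocr_image(path):
    """Return the text tesseract recognises in the image at `path`.

    Returns None if the tesseract executable cannot be found.
    `ocr_image` is a hypothetical helper, not part of pytesseract.
    """
    # Imported here so that defining the function does not fail
    # when the packages have not been installed yet.
    import pytesseract
    from PIL import Image

    try:
        with Image.open(path) as img:
            return pytesseract.image_to_string(img)
    except pytesseract.pytesseract.TesseractNotFoundError:
        print("tesseract is not installed or it's not in your PATH")
        return None


# Example usage with the file from the question:
# text = ocr_image('cropped_img.png')
# print(text)
```

Catching TesseractNotFoundError here just turns the traceback into a short message; the fix is still to install tesseract-ocr with apt.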

Hopefully this helps someone.

Answered by Srivats Shankar, Sep 23 '22 20:09