
OpenCV and pytesseract for OCR

Tags:

python

opencv

How can I use OpenCV and pytesseract to extract text from an image?

import cv2
import pytesseract
from PIL import Image
import numpy as np
from matplotlib import pyplot as plt

img = Image.open('test.jpg').convert('L')
img.show()
img.save('test','png')
img = cv2.imread('test.png',0)
edges = cv2.Canny(img,100,200)
#contour = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
#print pytesseract.image_to_string(Image.open(edges))
print pytesseract.image_to_string(edges)

But this gives the following error:

Traceback (most recent call last):
  File "open.py", line 14, in <module>
    print pytesseract.image_to_string(edges)
  File "/home/sroy8091/.local/lib/python2.7/site-packages/pytesseract/pytesseract.py", line 143, in image_to_string
    if len(image.split()) == 4:
AttributeError: 'NoneType' object has no attribute 'split'

sumitroy asked Feb 23 '26 21:02


1 Answer

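If you want to do some pre-processing with OpenCV (such as the edge detection you did) and then extract text, wrap the resulting NumPy array in a PIL image before passing it to pytesseract. The traceback shows image_to_string calling image.split(), which is a PIL Image method, so that version of pytesseract expects a PIL image rather than the array returned by cv2.Canny: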

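import cv2
import pytesseract
from PIL import Image

# Load the image as grayscale (flag 0) and run Canny edge detection
img = cv2.imread('test.png', 0)
edges = cv2.Canny(img, 100, 200)
# Wrap the NumPy array in a PIL Image, which pytesseract accepts
img_new = Image.fromarray(edges)
text = pytesseract.image_to_string(img_new, lang='eng')
print(text)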
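
A related pitfall worth guarding against: cv2.imread returns None instead of raising an error when it cannot read a file, and that None may only surface later as a confusing AttributeError. The following is a minimal Python 3 sketch of the same load, Canny, and Image.fromarray steps with explicit checks. The helper name edges_as_pil and its low/high parameters are hypothetical and not part of OpenCV, Pillow, or pytesseract.

```python
import os


def edges_as_pil(path, low=100, high=200):
    """Return the Canny edges of the image at `path` as a PIL Image.

    Hypothetical helper for illustration; `low` and `high` are the
    Canny thresholds (100 and 200 match the values used above).
    """
    # cv2.imread returns None rather than raising when a file cannot
    # be read, so check the path first to get a clear error message.
    if not os.path.isfile(path):
        raise FileNotFoundError("no such image file: %r" % path)

    # Imported here so the path check above runs even on a machine
    # where OpenCV or Pillow is not installed.
    import cv2
    from PIL import Image

    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        # The file exists but OpenCV could not decode it.
        raise ValueError("OpenCV could not decode %r" % path)

    edges = cv2.Canny(img, low, high)  # 2-D uint8 NumPy array
    return Image.fromarray(edges)      # PIL Image for pytesseract
```

With this helper, pytesseract.image_to_string(edges_as_pil('test.png'), lang='eng') would either return the recognized text or fail early with a message that names the problematic file.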
Deepan Raj answered Feb 25 '26 09:02


