
Use pytesseract OCR to recognize text from an image

Tags: python, image, image-processing, computer-vision, ocr

I need to use Pytesseract to extract text from this picture:

[Image: https://i.stack.imgur.com/HWLay.gif]

and the code:

```python
from PIL import Image, ImageEnhance, ImageFilter
import pytesseract

path = 'pic.gif'
img = Image.open(path)
img = img.convert('RGBA')
pix = img.load()
for y in range(img.size[1]):
    for x in range(img.size[0]):
        if pix[x, y][0] < 102 or pix[x, y][1] < 102 or pix[x, y][2] < 102:
            pix[x, y] = (0, 0, 0, 255)
        else:
            pix[x, y] = (255, 255, 255, 255)
img.save('temp.jpg')
text = pytesseract.image_to_string(Image.open('temp.jpg'))
# os.remove('temp.jpg')
print(text)
```

and the "temp.jpg" is

[Image: https://i.stack.imgur.com/gorYf.jpg]

Not bad, but the printed result is `,2 WW` instead of the correct text `2HHH`. How can I remove those black dots?

973

asked Jun 10 '16 10:06

Smith John

2 Answers

Here is my solution:

```python
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter

im = Image.open("temp.jpg")  # the second image from the question

im = im.filter(ImageFilter.MedianFilter())  # median filter (default size 3) to suppress isolated dots
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)                    # increase contrast by a factor of 2
im = im.convert('1')                        # convert to 1-bit black and white
im.save('temp2.jpg')
text = pytesseract.image_to_string(Image.open('temp2.jpg'))
print(text)
```
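To see why a median filter helps with isolated black dots, here is a minimal pure-Python sketch on a synthetic grid. It is not Pillow's implementation: the function name `median_filter_3x3`, the toy image, and the simplified border handling are all assumptions made for illustration.

```python
def median_filter_3x3(grid):
    """Replace each pixel with the median of its 3x3 neighbourhood.

    Border pixels use only the neighbours that exist, which is a
    simplification and not necessarily how Pillow treats edges.
    """
    height, width = len(grid), len(grid[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = sorted(
                grid[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            )
            out[y][x] = window[len(window) // 2]
    return out


# Hypothetical 7x7 image: white background (255), one isolated black dot,
# and a solid 3x3 black block standing in for part of a character stroke.
image = [[255] * 7 for _ in range(7)]
image[1][1] = 0
for y in range(3, 6):
    for x in range(3, 6):
        image[y][x] = 0

filtered = median_filter_3x3(image)
print("isolated dot after filtering:", filtered[1][1])
print("centre of block after filtering:", filtered[4][4])
```

The lone dot is outvoted by its white neighbours and disappears, while the centre of the solid block survives. The corners of the block are also outvoted, so very thin strokes can be damaged by the same filter.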

161

answered Sep 21 '22 21:09

Smith John

Here's a simple approach using OpenCV and Pytesseract OCR. Before running OCR on an image, it's important to preprocess it. The goal is a processed image in which the text to extract is black on a white background. To get there, we convert to grayscale, apply a slight Gaussian blur, then apply Otsu's threshold to obtain a binary image. Because the threshold is inverted, the text is white on black at this stage, so a morphological opening can remove the small specks of noise. Finally, we invert the image so the text is black on white. Text extraction uses the --psm 6 configuration option, which tells Tesseract to assume a single uniform block of text. See the Tesseract documentation on page segmentation modes for more options.

Here's a visualization of the image processing pipeline:

Input image

[Image: https://i.stack.imgur.com/XZ8xg.png]

Convert to grayscale -> Gaussian blur -> Otsu's threshold

[Image: https://i.stack.imgur.com/kbheK.png]
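For readers curious what Otsu's threshold computes, the following is a rough pure-Python sketch of the idea: choose the threshold that maximises the between-class variance of the histogram. OpenCV does this natively inside `cv2.threshold`. The function name `otsu_threshold` and the sample pixel values are made up for illustration, and the result is not guaranteed to match OpenCV's exactly.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0-255) that maximises between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    total_sum = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_low = 0  # number of pixels at or below the candidate threshold
    sum_low = 0     # sum of their intensities
    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue  # no pixels in the lower class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break     # every pixel is already in the lower class
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        between = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Hypothetical grayscale values: dark text pixels plus a bright background.
pixels = [30] * 40 + [35] * 10 + [220] * 50
t = otsu_threshold(pixels)

# Mirror cv2.THRESH_BINARY_INV: dark pixels become 255, bright pixels become 0.
binary_inv = [255 if p <= t else 0 for p in pixels]
print("threshold:", t)
print("white pixels after inverse threshold:", binary_inv.count(255))
```

The threshold lands between the dark and bright clusters, so the dark text pixels become white in the inverted binary image, which is the form the opening step expects.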

Notice the tiny specks of noise. We can remove them with a morphological opening.

[Image: https://i.stack.imgur.com/I2VhU.png]
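A morphological opening is an erosion followed by a dilation. The sketch below shows the effect on a small binary grid in pure Python, assuming a 3x3 square structuring element like the one in the answer's code. The helper names (`erode`, `dilate`, `opening`) and the toy grid are illustrative only; in practice `cv2.morphologyEx` with `cv2.MORPH_OPEN` does the work.

```python
def _window(grid, y, x):
    """Values in the 3x3 neighbourhood of (y, x), ignoring out-of-bounds cells."""
    height, width = len(grid), len(grid[0])
    return [
        grid[ny][nx]
        for ny in range(max(0, y - 1), min(height, y + 2))
        for nx in range(max(0, x - 1), min(width, x + 2))
    ]


def erode(grid):
    """A pixel stays 1 only if every pixel in its neighbourhood is 1."""
    return [[min(_window(grid, y, x)) for x in range(len(grid[0]))]
            for y in range(len(grid))]


def dilate(grid):
    """A pixel becomes 1 if any pixel in its neighbourhood is 1."""
    return [[max(_window(grid, y, x)) for x in range(len(grid[0]))]
            for y in range(len(grid))]


def opening(grid):
    """Erosion followed by dilation: removes shapes smaller than the kernel."""
    return dilate(erode(grid))


# Hypothetical 8x8 binary image (1 = white foreground after inverse threshold):
# a single-pixel speck of noise and a 4x4 block standing in for a character.
grid = [[0] * 8 for _ in range(8)]
grid[1][1] = 1
for y in range(3, 7):
    for x in range(3, 7):
        grid[y][x] = 1

opened = opening(grid)
print("speck after opening:", opened[1][1])
print("foreground pixels after opening:", sum(map(sum, opened)))
```

The speck cannot survive the erosion, so it is gone for good, while the block shrinks and is then restored by the dilation. Strokes thinner than the kernel would be removed as well, so the kernel size has to suit the text.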

Finally we invert the image

[Image: https://i.stack.imgur.com/x82Iz.png]

Result from Pytesseract OCR

2HHH

Code

```python
import cv2
import pytesseract

# Windows only: point pytesseract at the Tesseract binary (adjust or remove on other systems)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Morph open to remove noise and invert image
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
invert = 255 - opening

# Perform text extraction
data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
print(data)

# Show the intermediate images; press any key to close the windows
cv2.imshow('thresh', thresh)
cv2.imshow('opening', opening)
cv2.imshow('invert', invert)
cv2.waitKey()
```
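If `--psm 6` does not suit a particular image, the config string is the place to experiment. The helper below is a hypothetical convenience, not part of pytesseract, that assembles such a string. It assumes Tesseract's `--psm 7` mode (treat the image as a single text line) and the `tessedit_char_whitelist` variable; whether the whitelist is honoured depends on the Tesseract version and engine in use.

```python
def build_tesseract_config(psm=6, whitelist=None):
    """Assemble a config string for pytesseract.image_to_string.

    Hypothetical helper for illustration: it only formats the string
    and does not call Tesseract itself.
    """
    parts = [f"--psm {psm}"]
    if whitelist:
        # Restrict recognition to these characters (support varies by version)
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


config = build_tesseract_config(psm=7, whitelist="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")
print(config)

# Possible use with the preprocessed image from the code above:
# data = pytesseract.image_to_string(invert, lang='eng', config=config)
```

For a short code such as the one in the question, a single-line mode with a restricted character set may reduce stray punctuation, though the preprocessing is still what removes the dots.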

answered Sep 22 '22 21:09

nathancy
