Is there a way to find an object that has a specific color (for example, a red 100px by 50px rectangle with white text), select that object as it is, and cut it out into a new file? Look at the picture below. I'm trying to write a script that selects data from an image, converts it to text, and finally writes it to Excel.
I googled a lot of how-tos but didn't find any that address my problem.
Sample image
Hover the mouse pointer over an object or region in your image you would like to select. Selectable objects and regions will be highlighted with an overlay color. To customize the hover overlay, select the gear icon in the Options bar and modify desired settings. Click to automatically select the object or region.
You can access the tool from the image menu bar via Tools → Selection Tools → By Color Select, by clicking the tool icon in the Toolbox, or by using the keyboard shortcut Shift+O.
Select the Magic Wand tool in the Tools panel. In the Options bar, uncheck Contiguous if you want to select nonadjacent areas of similar color. Leave Contiguous checked if you want to select only adjacent areas of similar color. Click the color in the image that you want to select.
I don't know your real intention: would you like to only read the text, or do you also want to extract the parts? Anyway, I'm going to show you a straightforward and general solution. Take the parts you need; at the end you'll find the whole code.
For the whole bunch you need 4 modules:
cv2 (OpenCV) for image processing
numpy to handle special operations on the images
pytesseract to recognize text (OCR)
pillow (PIL) to prepare the image for pytesseract
Load and filter
Your original image:
First we filter out all colors except red. lower and upper describe the range of values we want to keep, given in BGR order (OpenCV stores the channels as blue, green, red rather than RGB).
image = cv.imread("AR87t.jpg")
lower = np.array([0, 0, 200])
upper = np.array([100, 100, 255])
shapeMask = cv.inRange(image, lower, upper)
cv.imshow("obj shapeMask", shapeMask)
cv.waitKey(0)
This shows a black-and-white mask: pixels inside the range are white, everything else is black.
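To see what the range filter does per pixel, here is a rough plain-NumPy equivalent of the cv.inRange call, run on a tiny synthetic image instead of the real file. The helper name in_range_mask and the sample pixel values are made up for illustration; in the actual script cv.inRange is the call to use.

```python
import numpy as np

def in_range_mask(image, lower, upper):
    """Return 255 where every channel lies in [lower, upper], else 0."""
    inside = np.all((image >= lower) & (image <= upper), axis=-1)
    return inside.astype(np.uint8) * 255

# Tiny 1x3 BGR image: pure red, white, dark red (illustrative values)
sample = np.array([[[0, 0, 255], [255, 255, 255], [0, 0, 120]]], dtype=np.uint8)
lower = np.array([0, 0, 200])
upper = np.array([100, 100, 255])

mask = in_range_mask(sample, lower, upper)
print(mask)
```

Only the first pixel passes: white fails because its blue and green channels are above 100, and dark red fails because its red channel is below 200.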
finding contours
Next, we find the contours and iterate through them. If a contour has 4 corners, we process it further.
# [-2] picks the contour list in both OpenCV 3.x and 4.x
cnts = cv.findContours(shapeMask.copy(), cv.RETR_EXTERNAL,
                       cv.CHAIN_APPROX_SIMPLE)[-2]
for c in cnts:
    peri = cv.arcLength(c, True)
    approx = cv.approxPolyDP(c, 0.04 * peri, True)
    if len(approx) == 4:
        ....
mask the original
With boundingRect, we extract x, y, w and h:
(x, y, w, h) = cv.boundingRect(approx)
cv.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), thickness=5)
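Since the question also asks about cutting each object out, here is a small sketch of that step. It assumes you collected the (x, y, w, h) tuples from boundingRect into a list; the helper crop_boxes and the sample boxes are hypothetical. It sorts the boxes top to bottom, then left to right, and slices each region out of the image array.

```python
import numpy as np

def crop_boxes(image, boxes):
    """Crop each (x, y, w, h) box, ordered top-to-bottom, then left-to-right."""
    ordered = sorted(boxes, key=lambda b: (b[1], b[0]))
    # NumPy images are indexed [row, column], i.e. [y, x]
    return [image[y:y + h, x:x + w].copy() for (x, y, w, h) in ordered]

# Stand-in for the loaded picture: 200 rows x 300 columns, 3 channels
image = np.zeros((200, 300, 3), dtype=np.uint8)
boxes = [(150, 120, 100, 50), (20, 10, 100, 50)]  # illustrative values

crops = crop_boxes(image, boxes)
for i, crop in enumerate(crops):
    print(i, crop.shape)
```

Each crop can then be written to its own file, for example with cv.imwrite("part_0.png", crop), where the file name is up to you.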
ocr on the mask
And here comes the magic! First we extract the matching part of the mask and convert the OpenCV image to a PIL image. We are then able to run Tesseract over it.
el = shapeMask.copy()[y:y + h, x:x + w]
pil_im = Image.fromarray(el)
cv.imshow("obj", el)
cv.waitKey(0)
print(pytesseract.image_to_string(pil_im))
This shows you every rectangle as a small image. Your console will print out:
L2 = 33,33
L3 = 44,44
L1 = 12,22
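For the last step the question mentions, getting the values into Excel, one simple route is a CSV file, which Excel can open. The sketch below assumes the OCR output has the label = value shape shown above and that the comma is a decimal separator; both are assumptions about your data, and the names parse_ocr_lines and values.csv are made up for the example.

```python
import csv

def parse_ocr_lines(lines):
    """Split 'L2 = 33,33' style lines into (label, float) pairs."""
    rows = []
    for line in lines:
        if "=" not in line:
            continue  # skip empty or unrecognized OCR output
        label, value = (part.strip() for part in line.split("=", 1))
        try:
            rows.append((label, float(value.replace(",", "."))))
        except ValueError:
            continue  # value was not a number; OCR may have misread it
    return sorted(rows)

ocr_output = ["L2 = 33,33", "L3 = 44,44", "L1 = 12,22"]
rows = parse_ocr_lines(ocr_output)

with open("values.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["label", "value"])
    writer.writerows(rows)
```

In the real script you would feed it the strings returned by pytesseract.image_to_string instead of the hard-coded list.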
code
import cv2 as cv
import numpy as np
import pytesseract
from PIL import Image

image = cv.imread("AR87t.jpg")

# keep only the red pixels (values are in BGR order)
lower = np.array([0, 0, 200])
upper = np.array([100, 100, 255])
shapeMask = cv.inRange(image, lower, upper)
cv.imshow("obj shapeMask", shapeMask)
cv.waitKey(0)

# [-2] picks the contour list in both OpenCV 3.x and 4.x
cnts = cv.findContours(shapeMask.copy(), cv.RETR_EXTERNAL,
                       cv.CHAIN_APPROX_SIMPLE)[-2]
for c in cnts:
    peri = cv.arcLength(c, True)
    approx = cv.approxPolyDP(c, 0.04 * peri, True)
    if len(approx) == 4:
        (x, y, w, h) = cv.boundingRect(approx)
        cv.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), thickness=5)
        print("x:%s, y:%s, w:%s, h:%s" % (x, y, w, h))
        el = shapeMask.copy()[y:y + h, x:x + w]
        pil_im = Image.fromarray(el)
        cv.imshow("obj", el)
        cv.waitKey(0)
        print(pytesseract.image_to_string(pil_im))

cv.imshow("obj rectangle", image)
cv.waitKey(0)