I have images like this one, from which I want to extract only the characters.
After binarization, I am getting this image
import cv2

img = cv2.imread('the_image.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# adaptive mean threshold: block size 11, constant 9 subtracted from the local mean
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 9)
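Then I find contours on this image.
# findContours returns 3 values in OpenCV 3.x and 2 values in OpenCV 4.x; [-2] is the contour list in both
cnts = cv2.findContours(thresh.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)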
for contour in cnts[:2000]:
    x, y, w, h = cv2.boundingRect(contour)
    aspect_ratio = h/w
    area = cv2.contourArea(contour)
    cv2.drawContours(img, [contour], -1, (0, 255, 0), 2)
I am getting this result:
I need a way to filter the contours so that only the characters are selected. Then I can find their bounding boxes and extract the ROIs.
I can find contours and filter them by area, but the resolution of the source images is not consistent. The images are taken with mobile phone cameras.
Also, because the borders of the boxes are disconnected, I can't detect the boxes accurately.
Edit:
If I skip contours whose bounding box has an aspect ratio (height/width) below 0.4, it works to some extent. But I don't know whether it will work for images of different resolutions.
for contour in cnts[:2000]:
    x, y, w, h = cv2.boundingRect(contour)
    aspect_ratio = h/w
    area = cv2.contourArea(contour)
    if aspect_ratio < 0.4:
        continue
    print(aspect_ratio)
    cv2.drawContours(img, [contour], -1, (0, 255, 0), 2)
Not so difficult...
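import cv2

img = cv2.imread('img.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('gray', gray)

# Otsu's method picks the global threshold automatically
ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU)
cv2.imshow('thresh', thresh)

# [-2] is the contour list in both OpenCV 3.x (3 return values) and 4.x (2 return values)
ctrs = cv2.findContours(thresh.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]

# sort contours left to right by the x coordinate of their bounding boxes
sorted_ctrs = sorted(ctrs, key=lambda ctr: cv2.boundingRect(ctr)[0])

for i, ctr in enumerate(sorted_ctrs):
    x, y, w, h = cv2.boundingRect(ctr)
    roi = img[y:y + h, x:x + w]  # ROI cut from the original image
    area = w * h
    # keep only boxes whose area is plausible for a character
    if 250 < area < 900:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imshow('rect', img)
cv2.waitKey(0)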
Result
You can tweak the code as you like. Here the ROI is cut from the original image; if you eventually want to run OCR on the characters, save the ROIs from the binary image instead. Better methods than filtering by area are available.
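Since the fixed limits 250 < area < 900 depend on the image resolution, one possible tweak is to express the limits as a fraction of the image area and combine them with the aspect-ratio check from the question. The following is only a sketch: the function name filter_char_boxes and the default values of min_frac, max_frac and min_aspect are hypothetical and would need tuning on real images.

```python
def filter_char_boxes(boxes, img_w, img_h,
                      min_frac=0.0005, max_frac=0.01, min_aspect=0.4):
    """Keep (x, y, w, h) boxes that look like characters.

    A box is kept when its area, as a fraction of the image area, lies
    strictly between min_frac and max_frac, and its height/width ratio is
    at least min_aspect. All default thresholds are illustrative guesses.
    """
    img_area = float(img_w * img_h)
    kept = []
    for (x, y, w, h) in boxes:
        if w <= 0 or h <= 0:
            continue  # skip degenerate boxes
        frac = (w * h) / img_area   # resolution-independent size measure
        aspect = h / float(w)       # same ratio as in the question
        if min_frac < frac < max_frac and aspect >= min_aspect:
            kept.append((x, y, w, h))
    # left-to-right order, like the sort by x in the code above
    return sorted(kept, key=lambda b: b[0])
```

With OpenCV, boxes could be built as [cv2.boundingRect(c) for c in ctrs], and img_h, img_w taken from gray.shape. Cropping thresh[y:y + h, x:x + w] for each kept box then gives the ROIs in binary form.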
Source: Extract ROI from image with Python and OpenCV and some of my knowledge.
Just kidding, take a look at my questions/answers.