Unable to segment handwritten characters

Question

I am trying to extract handwritten digits and letters from an image, and for that I followed this Stack Overflow link. It works fine for most images where the characters are written with a marker, but it fails badly on images where they are written with a pen. I need some help fixing this.

Below is my code:

import cv2
import imutils
from imutils import contours

# Load the image and resize it to a fixed width
image = cv2.imread('xxx/pic_crop_7.png')
image = imutils.resize(image, width=350)
img = image.copy()  # copy used only for drawing the bounding boxes

# Remove border
kernel_vertical = cv2.getStructuringElement(cv2.MORPH_RECT, (1,50))
temp1 = 255 - cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel_vertical)
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (50,1))
temp2 = 255 - cv2.morphologyEx(image, cv2.MORPH_CLOSE, horizontal_kernel)
temp3 = cv2.add(temp1, temp2)
result = cv2.add(temp3, image)

# Convert to grayscale and Otsu's threshold
gray = cv2.cvtColor(result, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray,(5,5),0)
_,thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)
# thresh=cv2.dilate(thresh,None,iterations=1)

# Find contours and filter using contour area
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# findContours returns 2 values in OpenCV 4 and 3 values in OpenCV 3
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

MIN_AREA=45

digit_contours = []
for c in cnts:
    if cv2.contourArea(c)>MIN_AREA:
        x,y,w,h = cv2.boundingRect(c)
        cv2.rectangle(img, (x, y), (x + w, y + h), (36,255,12), 2)
        digit_contours.append(c)
#         cv2.imwrite("C:/Samples/Dataset/ocr/segmented" + str(i) + ".png", image[y:y+h,x:x+w])


sorted_digit_contours = contours.sort_contours(digit_contours, method='left-to-right')[0]
contour_number = 0
for c in sorted_digit_contours:
    x,y,w,h = cv2.boundingRect(c)
    ROI = image[y:y+h, x:x+w]
    cv2.imwrite('xxx/segment_{}.png'.format(contour_number), ROI)
    contour_number += 1
    
    
cv2.imshow('thresh', thresh)
cv2.imshow('img', img)
cv2.waitKey()

It correctly extracts the numbers when they are written with a marker.

Below is an example:

Original Image

Correctly extracting characters

Image where it fails to read:

Original Image

Incorrectly Extracting

CodingPeter · Accepted Answer

In this case, you only need to adjust your parameters. Because there are no vertical lines in the background of your handwritten characters, I removed the vertical-kernel step and kept only the horizontal one.

# Remove border
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (50,1))
temp2 = 255 - cv2.morphologyEx(image, cv2.MORPH_CLOSE, horizontal_kernel)
result = cv2.add(temp2, image)

And it works.


Tags: python, image-processing, opencv, computer-vision, ocr
