Filtering Image For Improving Text Recognition

Tags: python, image-processing, opencv, text-recognition

I have the source image below (already cropped), and I am trying to do some image processing on it before reading the text.

With Python and OpenCV, I tried to remove the lines in the background using k-means with k = 2, and the result is:

I also tried to smooth the image using the code below:

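import tempfile

import cv2
import numpy as np
from PIL import Image

# IMAGE_SIZE and BINARY_THREHOLD are module-level constants not shown in the question.


def process_image_for_ocr(file_path):
    # TODO : Implement using opencv
    temp_filename = set_image_dpi(file_path)
    im_new = remove_noise_and_smooth(temp_filename)
    return im_new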


def set_image_dpi(file_path):
    im = Image.open(file_path)
    length_x, width_y = im.size
    factor = max(1, int(IMAGE_SIZE / length_x))
    size = factor * length_x, factor * width_y
    # size = (1800, 1800)
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
    im_resized = im.resize(size, Image.LANCZOS)
    temp_file = tempfile.NamedTemporaryFile(delete=False, suffix='.jpg')
    temp_filename = temp_file.name
    im_resized.save(temp_filename, dpi=(300, 300))
    return temp_filename


def image_smoothening(img):
    ret1, th1 = cv2.threshold(img, BINARY_THREHOLD, 255, cv2.THRESH_BINARY)
    ret2, th2 = cv2.threshold(th1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    blur = cv2.GaussianBlur(th2, (1, 1), 0)
    ret3, th3 = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return th3


def remove_noise_and_smooth(file_name):
    img = cv2.imread(file_name, 0)
    filtered = cv2.adaptiveThreshold(img.astype(np.uint8), 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 41, 3)
    kernel = np.ones((1, 1), np.uint8)
    opening = cv2.morphologyEx(filtered, cv2.MORPH_OPEN, kernel)
    closing = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, kernel)
    img = image_smoothening(img)
    or_image = cv2.bitwise_or(img, closing)
    return or_image

And the result is:

Can you help me with any ideas for removing the lines in the background of the source image?

asked Jul 31 '18 06:07

S. Hersister

1 Answer

One approach is to compute an unsupervised k-means segmentation of the image's gray levels and threshold at one of the resulting cluster centers. You need to experiment with the k and i_val values to get the desired output.

First, create a function that finds the k candidate threshold values. It calculates an image histogram, which is returned for inspection only; the clustering itself runs on the raw pixel intensities. .ravel() flattens the NumPy array to 1-D, and np.reshape(img, (-1, 1)) then converts it to a 2-D array of shape (n, 1), with one sample per row. Next we carry out the k-means as described here.

The function takes the input grayscale image, the number of clusters k, and the index of the sorted cluster center you want to threshold at (i_val). It returns the threshold value at the desired i_val, along with the sorted centers and the histogram.

import cv2
import numpy as np


def kmeans(input_img, k, i_val):
    hist = cv2.calcHist([input_img],[0],None,[256],[0,256])
    img = input_img.ravel()
    img = np.reshape(img, (-1, 1))
    img = img.astype(np.float32)

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    flags = cv2.KMEANS_RANDOM_CENTERS
    compactness,labels,centers = cv2.kmeans(img,k,None,criteria,10,flags)
    centers = np.sort(centers, axis=0)

    # centers has shape (k, 1); return the chosen center as a plain int threshold.
    return int(centers[i_val, 0]), centers, hist

img = cv2.imread('Y8CSE.jpg', 0)
_, thresh = cv2.threshold(img, kmeans(input_img=img, k=8, i_val=2)[0], 255, cv2.THRESH_BINARY)
cv2.imwrite('text.png',thresh)
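
To make the role of k and i_val more concrete, here is a minimal sketch on a synthetic image with exactly three gray levels. It is not the code above: `kmeans_1d_centers` and `binarize` are hypothetical helpers, and a small NumPy-only k-means with evenly spaced starting centers stands in for cv2.kmeans so that the result is deterministic. The gray values 30, 150 and 230 are made up for illustration.

```python
import numpy as np


def kmeans_1d_centers(gray, k, iters=20):
    """Return the k sorted cluster centers of the pixel intensities (Lloyd's algorithm)."""
    # Same flattening as in the answer: (H, W) -> (H*W, 1) float32 samples.
    samples = gray.ravel().reshape(-1, 1).astype(np.float32)
    # Deterministic start: centers spread evenly over the intensity range
    # (the answer's cv2.kmeans call uses random initial centers instead).
    centers = np.linspace(samples.min(), samples.max(), k, dtype=np.float32).reshape(-1, 1)
    for _ in range(iters):
        # Assign every sample to its nearest center.
        labels = np.argmin(np.abs(samples - centers.T), axis=1)
        # Move each center to the mean of its samples; leave empty clusters untouched.
        for j in range(k):
            members = samples[labels == j]
            if len(members):
                centers[j, 0] = members.mean()
    return np.sort(centers, axis=0)


def binarize(gray, centers, i_val):
    """Pixels brighter than the chosen center become white, all others black."""
    threshold = int(centers[i_val, 0])
    return np.where(gray > threshold, 255, 0).astype(np.uint8)


# Synthetic stand-in for the scan: light paper (230), gray lines (150), dark text (30).
gray = np.full((40, 40), 230, dtype=np.uint8)
gray[::8, :] = 150        # horizontal background lines
gray[10:30, 18:22] = 30   # a dark vertical stroke standing in for text

centers = kmeans_1d_centers(gray, k=3)
low = binarize(gray, centers, i_val=0)   # threshold at the darkest center
mid = binarize(gray, centers, i_val=1)   # threshold at the middle center
```

In this toy case, thresholding at the darkest center (i_val=0) turns the gray lines white while the stroke stays black, whereas i_val=1 leaves the lines black. On a real scan the text pixels are spread around their center rather than sitting exactly on it, which is presumably why a larger k with a slightly higher i_val is needed there.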

The output for this looks like:

K-MEANS threshold

You could carry on with this method by applying morphological operators, or pre-mask the image using a Hough transform, as seen in the first answer here.

answered Oct 12 '22 02:10

D.Griffiths
