Proper image thresholding to prepare it for OCR in python using opencv

Question

I am really new to opencv and a beginner to python.

I have this image:

original bmp 24bit image

I want to somehow apply proper thresholding to keep nothing but the 6 digits.

The bigger picture is that I intend to try to perform manual OCR to the image for each digit separately, using the k-nearest neighbours algorithm on a per digit level (kNearest.findNearest)

The problem is that I cannot clean up the digits sufficiently, especially the '7' digit which has this blue-ish watermark passing through it.

The steps I have tried so far are the following:

I am reading the image from disk

# IMREAD_UNCHANGED is -1
image = cv2.imread(sys.argv[1], cv2.IMREAD_UNCHANGED)

Then I'm keeping only the blue channel to get rid of the blue watermark around digit '7', effectively converting it to a single channel image

image = image[:,:,0] 
# openned with -1 which means as is, 
# so the blue channel is the first in BGR

single channel - red only - image

Then I'm multiplying it a bit to increase contrast between the digits and the background:

image = cv2.multiply(image, 1.5)

multiplied image to increase contrast

Finally I perform Binary+Otsu thresholding:

_,thressed1 = cv2.threshold(image,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

binary Oahu thresholded image

As you can see the end result is pretty good except for the digit '7' which has kept a lot of noise.

How to improve the end result? Please supply the image example result where possible, it is better to understand than just code snippets alone.

Kinght 金 · Accepted Answer

You can try to medianBlur the gray(blur) image with different kernels(such as 3, 51), divide the blured results, and threshold it. Something like this：

enter image description here

#!/usr/bin/python3
# 2018/09/23 17:29 (CST) 
# (中秋节快乐)
# (Happy Mid-Autumn Festival)

import cv2 
import numpy as np 

fname = "color.png"
bgray = cv2.imread(fname)[...,0]

blured1 = cv2.medianBlur(bgray,3)
blured2 = cv2.medianBlur(bgray,51)
divided = np.ma.divide(blured1, blured2).data
normed = np.uint8(255*divided/divided.max())
th, threshed = cv2.threshold(normed, 100, 255, cv2.THRESH_OTSU)

dst = np.vstack((bgray, blured1, blured2, normed, threshed)) 
cv2.imwrite("dst.png", dst)

The result：

enter image description here

Plutoberth · Answer

Why not just keep values in the image that are above a certain threshold?

Like this:

import cv2
import numpy as np

img = cv2.imread("./a.png")[:,:,0]  # the last readable image

new_img = []
for line in img:
    new_img.append(np.array(list(map(lambda x: 0 if x < 100 else 255, line))))

new_img = np.array(list(map(lambda x: np.array(x), new_img)))

cv2.imwrite("./b.png", new_img)

Looks great:

You could probably play with the threshold even more and get better results.

Proper image thresholding to prepare it for OCR in python using opencv

Tags:

python

image-processing

opencv

ocr

image-thresholding

Nikolas Kazepis

2 Answers

Kinght 金

Plutoberth

Recent Activity

Donate For Us

Proper image thresholding to prepare it for OCR in python using opencv

Tags:

python

image-processing

opencv

ocr

image-thresholding

Nikolas Kazepis

2 Answers

Kinght 金

Plutoberth

Related questions

Recent Activity

Donate For Us