I'm working on an OCR application in Python to read digits. I'm using OpenCV to find the contours in an image, crop the digit, and then preprocess it to 28x28 for the MNIST dataset. My images are not square, so I seem to lose a lot of quality when I resize them. Any tips or suggestions I could try?
This is the original image
This is after editing it
And this is the quality it should be
I've tried some tricks from http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_morphological_ops/py_morphological_ops.html , such as dilation and opening, but they don't make it any better; they only make the digit blurrier...
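Roughly, this is what I mean by trying dilation and opening (applied to the thresholded digit image; the 3x3 kernel is just an example size, not necessarily the one I settled on):

import cv2
import numpy as np

# example structuring element; I also experimented with other sizes
kernel = np.ones((3, 3), np.uint8)

# opening: erosion followed by dilation, meant to remove small noise specks
opened = cv2.morphologyEx(gray, cv2.MORPH_OPEN, kernel)

# dilation: thickens the strokes of the digit
dilated = cv2.dilate(gray, kernel, iterations=1)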
This is the code I'm using (find the contour, crop it, resize it, threshold it, and then center it):
import math

import numpy as np
import cv2
import imutils
from imutils.perspective import four_point_transform
from scipy import ndimage

images = np.zeros((4, 784))
correct_vals = np.zeros((4, 10))
i = 0

def getBestShift(img):
    # shift needed to move the center of mass to the image center
    cy, cx = ndimage.center_of_mass(img)
    rows, cols = img.shape
    shiftx = np.round(cols / 2.0 - cx).astype(int)
    shifty = np.round(rows / 2.0 - cy).astype(int)
    return shiftx, shifty

def shift(img, sx, sy):
    rows, cols = img.shape
    M = np.float32([[1, 0, sx], [0, 1, sy]])
    shifted = cv2.warpAffine(img, M, (cols, rows))
    return shifted

for no in [1, 3, 4, 5]:
    image = cv2.imread("images/" + str(no) + ".jpg")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edged = cv2.Canny(blurred, 50, 200, 255)

    cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
                            cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)  # handles OpenCV 2/3/4 return formats
    cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
    displayCnt = None

    for c in cnts:
        # approximate the contour
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * peri, True)
        # if the contour has four vertices, then we have found
        # the thermostat display
        if len(approx) == 4:
            displayCnt = approx
            break

    # perspective-correct the display region, invert, and shrink to 28x28
    warped = four_point_transform(gray, displayCnt.reshape(4, 2))
    gray = cv2.resize(255 - warped, (28, 28))
    (thresh, gray) = cv2.threshold(gray, 128, 255,
                                   cv2.THRESH_BINARY | cv2.THRESH_OTSU)

    # strip empty rows and columns around the digit
    while np.sum(gray[0]) == 0:
        gray = gray[1:]
    while np.sum(gray[:, 0]) == 0:
        gray = np.delete(gray, 0, 1)
    while np.sum(gray[-1]) == 0:
        gray = gray[:-1]
    while np.sum(gray[:, -1]) == 0:
        gray = np.delete(gray, -1, 1)

    # resize so the longer side is 20 px, preserving the aspect ratio
    rows, cols = gray.shape
    if rows > cols:
        factor = 20.0 / rows
        rows = 20
        cols = int(round(cols * factor))
        gray = cv2.resize(gray, (cols, rows))
    else:
        factor = 20.0 / cols
        cols = 20
        rows = int(round(rows * factor))
        gray = cv2.resize(gray, (cols, rows))

    # pad back out to 28x28 and center the digit on its center of mass
    colsPadding = (int(math.ceil((28 - cols) / 2.0)), int(math.floor((28 - cols) / 2.0)))
    rowsPadding = (int(math.ceil((28 - rows) / 2.0)), int(math.floor((28 - rows) / 2.0)))
    gray = np.pad(gray, (rowsPadding, colsPadding), 'constant')

    shiftx, shifty = getBestShift(gray)
    shifted = shift(gray, shiftx, shifty)
    gray = shifted

    cv2.imwrite("processed/" + str(no) + ".png", gray)
    cv2.imshow("imgs", gray)
    cv2.waitKey(0)
When you resize the image, make sure you select the interpolation method that best suits your needs. For this case, I recommend:
gray = cv2.resize(255 - warped, (28, 28), interpolation=cv2.INTER_AREA)
which gives a noticeably cleaner result after the rest of your processing.
You can see a comparison of methods here: http://tanbakuchi.com/posts/comparison-of-openv-interpolation-algorithms/ but since there are only a handful, you can try them all and see which gives the best results. It looks like the default is INTER_LINEAR.
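If you want to compare them quickly, a minimal sketch along these lines works; it assumes the warped variable from your loop, and the flag list and output filenames are just for illustration:

import cv2

# resize the warped digit with each interpolation flag and write the
# results out so they can be inspected side by side
flags = {
    "nearest": cv2.INTER_NEAREST,
    "linear": cv2.INTER_LINEAR,
    "area": cv2.INTER_AREA,
    "cubic": cv2.INTER_CUBIC,
    "lanczos": cv2.INTER_LANCZOS4,
}
for name, flag in flags.items():
    resized = cv2.resize(255 - warped, (28, 28), interpolation=flag)
    cv2.imwrite("processed/compare_" + name + ".png", resized)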