Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python OpenCV skew correction for OCR

Currently, I am working on an OCR project where I need to read the text off of a label (see example images below). I am running into issues with the image skew and I need help fixing the image skew so the text is horizontal and not at an angle. Currently the process I am using attempts to score different angles from a given range (code included below), but this method is inconsistent and sometimes overcorrects an image skew or flat out fails to identify the skew and correct it. Just as a note, before the skew correction I am rotating all of the images by 270 degrees to get the text upright, then I am passing the image through the code below. The image passed through to the function is already a binary image.

Code:


def findScore(img, angle):
    """
    Generates a score for the binary image recieved dependent on the determined angle.\n
    Vars:\n
    - array <- numpy array of the label\n
    - angle <- predicted angle at which the image is rotated by\n
    Returns:\n
    - histogram of the image
    - score of potential angle
    """
    data = inter.rotate(img, angle, reshape = False, order = 0)
    hist = np.sum(data, axis = 1)
    score = np.sum((hist[1:] - hist[:-1]) ** 2)
    return hist, score

def skewCorrect(img):
    """
    Takes in a nparray and determines the skew angle of the text, then corrects the skew and returns the corrected image.\n
    Vars:\n
    - img <- numpy array of the label\n
    Returns:\n
    - Corrected image as a numpy array\n
    """
    #Crops down the skewImg to determine the skew angle
    img = cv2.resize(img, (0, 0), fx = 0.75, fy = 0.75)

    delta = 1
    limit = 45
    angles = np.arange(-limit, limit+delta, delta)
    scores = []
    for angle in angles:
        hist, score = findScore(img, angle)
        scores.append(score)
    bestScore = max(scores)
    bestAngle = angles[scores.index(bestScore)]
    rotated = inter.rotate(img, bestAngle, reshape = False, order = 0)
    print("[INFO] angle: {:.3f}".format(bestAngle))
    #cv2.imshow("Original", img)
    #cv2.imshow("Rotated", rotated)
    #cv2.waitKey(0)
    
    #Return img
    return rotated

Example images of the label before correction and after

Before correction -> After correction

If anyone can help me figure this problem out, it would be of much help.

like image 773
Peter S Avatar asked Sep 16 '19 21:09

Peter S


2 Answers

To add up to @nathancy answer, for windows users, if you're getting additional skew just add dtype=float. Whenever you create a numpy array. There's a integer overflow issue with windows as it assigns int(32) bit as data type unlike rest of the systems.

See below code; added dtype=float in np.sum() methods:

import cv2
import numpy as np
from scipy.ndimage import interpolation as inter

def correct_skew(image, delta=1, limit=5):
    def determine_score(arr, angle):
        data = inter.rotate(arr, angle, reshape=False, order=0)
        histogram = np.sum(data, axis=1, dtype=float)
        score = np.sum((histogram[1:] - histogram[:-1]) ** 2, dtype=float)
        return histogram, score

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1] 

    scores = []
    angles = np.arange(-limit, limit + delta, delta)
    for angle in angles:
        histogram, score = determine_score(thresh, angle)
        scores.append(score)

    best_angle = angles[scores.index(max(scores))]

    (h, w) = image.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, best_angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC, \
          borderMode=cv2.BORDER_REPLICATE)

    return best_angle, rotated

if __name__ == '__main__':
    image = cv2.imread('1.png')
    angle, rotated = correct_skew(image)
    print(angle)
    cv2.imshow('rotated', rotated)
    cv2.imwrite('rotated.png', rotated)
    cv2.waitKey()
like image 86
full_pr0 Avatar answered Sep 17 '22 01:09

full_pr0


Here's an implementation of the Projection Profile Method to determine skew. After obtaining a binary image, the idea is rotate the image at various angles and generate a histogram of pixels in each iteration. To determine the skew angle, we compare the maximum difference between peaks and using this skew angle, rotate the image to correct the skew


Left (original), Right (corrected)

import cv2
import numpy as np
from scipy.ndimage import interpolation as inter

def correct_skew(image, delta=1, limit=5):
    def determine_score(arr, angle):
        data = inter.rotate(arr, angle, reshape=False, order=0)
        histogram = np.sum(data, axis=1)
        score = np.sum((histogram[1:] - histogram[:-1]) ** 2)
        return histogram, score

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1] 

    scores = []
    angles = np.arange(-limit, limit + delta, delta)
    for angle in angles:
        histogram, score = determine_score(thresh, angle)
        scores.append(score)

    best_angle = angles[scores.index(max(scores))]

    (h, w) = image.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, best_angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC, \
              borderMode=cv2.BORDER_REPLICATE)

    return best_angle, rotated

if __name__ == '__main__':
    image = cv2.imread('1.png')
    angle, rotated = correct_skew(image)
    print(angle)
    cv2.imshow('rotated', rotated)
    cv2.imwrite('rotated.png', rotated)
    cv2.waitKey()
like image 24
nathancy Avatar answered Sep 21 '22 01:09

nathancy