iOS UIImage Binarization for OCR - handling images with varying luminance

I had a C++ binarization routine that I used to prepare images for OCR. However, I found that it slanted the text unnecessarily. Searching for alternatives, I found GPUImage to be of great value, and it solved the slanting issue.

I am using GPUImage code like this to binarize my input images before applying OCR.

However, a single threshold value does not cover the range of images I get. Here are two samples from my input images:

[Images: two sample input images]

I can't handle both with the same threshold value. A low value works for the latter, and a higher value works for the first one.

The second image seems to be especially complex: I never get all the characters binarized correctly, regardless of the threshold value I set. My C++ binarization routine, on the other hand, seems to get it right, but I don't have enough insight into it to experiment with it the way I can with the simple threshold value in GPUImage.

How should I handle that?

UPDATE:

I tried GPUImageAverageLuminanceThresholdFilter with the default multiplier of 1. It works fine with the first image, but the second image continues to be a problem.
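For reference, here is a minimal CPU sketch of what average-luminance thresholding amounts to: one global threshold equal to the mean luminance times a multiplier. It is not GPUImage's implementation, and the function name and the flat 8-bit grayscale buffer layout are assumptions made for illustration. It helps show why a single threshold can work for an evenly lit image and still fail when part of the image is much darker.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of GPUImage): binarize an 8-bit grayscale
// buffer with one global threshold = average luminance * multiplier.
// Pixels brighter than the threshold become 255, all others become 0.
std::vector<std::uint8_t> averageLuminanceThreshold(
    const std::vector<std::uint8_t>& gray, double multiplier = 1.0) {
    if (gray.empty()) return {};

    const double sum = std::accumulate(gray.begin(), gray.end(), 0.0);
    const double threshold =
        multiplier * sum / static_cast<double>(gray.size());

    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        out[i] = gray[i] > threshold ? 255 : 0;
    }
    return out;
}
```

If half of the image is well lit and the other half is in shadow, the mean lands between the two regions, so the shadowed background and the text inside it both fall below the threshold and merge into solid black.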

Some more diverse inputs for binarization:

[Images: two further sample input images]

UPDATE II:

After going through this answer by Brad, I tried GPUImageAdaptiveThresholdFilter (also incorporating GPUImagePicture, because earlier I was only applying the filter to a UIImage).

With this, the second image is binarized perfectly. However, the first one has a lot of noise after binarization when I set the blur size to 3.0, and OCR then adds extra characters. With a lower blur size, the second image loses precision.

Here is the code:

+ (UIImage *)binarize:(UIImage *)sourceImage
{
    // Convert to grayscale first (toGrayscale: is not shown here).
    UIImage *grayScaledImg = [self toGrayscale:sourceImage];
    GPUImagePicture *imageSource = [[GPUImagePicture alloc] initWithImage:grayScaledImg];

    // Adaptive threshold: each pixel is compared against its local luminance.
    GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
    stillImageFilter.blurSize = 3.0;

    [imageSource addTarget:stillImageFilter];
    [imageSource processImage];

    UIImage *imageWithAppliedThreshold = [stillImageFilter imageFromCurrentlyProcessedOutput];
    return imageWithAppliedThreshold;
}
Asked by Nirav Bhatt, Oct 21 '22 01:10


1 Answer

As a preprocessing step, you need adaptive thresholding here.
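To make the idea concrete, here is a minimal sketch of mean-based adaptive thresholding in plain C++. It is not OpenCV's or GPUImage's code; the function name, the row-major 8-bit buffer layout, and the border handling (windows are clipped at the image edge) are assumptions chosen for illustration. Each pixel is compared against the mean of its own neighbourhood minus a small offset, so a bright region and a dark region each get their own effective threshold.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: adaptive mean thresholding on a row-major 8-bit
// grayscale image. A pixel becomes 255 when it is brighter than
// (mean of the blockSize x blockSize window around it) - offset, else 0.
// blockSize is expected to be odd; windows are clipped at the borders.
std::vector<std::uint8_t> adaptiveMeanThreshold(
    const std::vector<std::uint8_t>& gray, int width, int height,
    int blockSize, double offset) {
    // Integral image with an extra zero row and column, so that
    // integral[idx(x, y)] is the sum of all pixels above and left of (x, y).
    const std::size_t stride = static_cast<std::size_t>(width) + 1;
    auto idx = [stride](int x, int y) {
        return static_cast<std::size_t>(y) * stride + static_cast<std::size_t>(x);
    };
    std::vector<std::uint64_t> integral(
        stride * (static_cast<std::size_t>(height) + 1), 0);
    for (int y = 0; y < height; ++y) {
        std::uint64_t rowSum = 0;
        for (int x = 0; x < width; ++x) {
            rowSum += gray[static_cast<std::size_t>(y) * width + x];
            integral[idx(x + 1, y + 1)] = integral[idx(x + 1, y)] + rowSum;
        }
    }

    const int half = blockSize / 2;
    std::vector<std::uint8_t> out(gray.size());
    for (int y = 0; y < height; ++y) {
        const int y0 = std::max(0, y - half);
        const int y1 = std::min(height - 1, y + half);
        for (int x = 0; x < width; ++x) {
            const int x0 = std::max(0, x - half);
            const int x1 = std::min(width - 1, x + half);
            // Window sum from four integral-image lookups.
            const std::uint64_t sum =
                (integral[idx(x1 + 1, y1 + 1)] + integral[idx(x0, y0)]) -
                (integral[idx(x0, y1 + 1)] + integral[idx(x1 + 1, y0)]);
            const double count =
                static_cast<double>(x1 - x0 + 1) * (y1 - y0 + 1);
            const double localThreshold = sum / count - offset;
            const std::uint8_t pixel =
                gray[static_cast<std::size_t>(y) * width + x];
            out[static_cast<std::size_t>(y) * width + x] =
                pixel > localThreshold ? 255 : 0;
        }
    }
    return out;
}
```

The positive offset matters: in a flat background region the pixel equals the local mean, and without the offset such pixels would sit exactly on the threshold and flip with the slightest noise.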

I got these results using OpenCV's grayscale conversion and adaptive thresholding methods. Adding low-pass noise filtering (Gaussian or median) on top of that should make it work like a charm.

[Images: adaptive thresholding results, labelled "luminance" and "diverse"]
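As an illustration of the median option mentioned above, here is a minimal 3x3 median filter in plain C++. It is only a sketch, not OpenCV's implementation; the function name, the row-major 8-bit buffer layout, and the choice to replicate border pixels are assumptions. A median filter removes isolated speckles while keeping character edges sharper than a Gaussian blur of similar size would.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: 3x3 median filter on a row-major 8-bit grayscale
// image. Coordinates outside the image are clamped, which replicates the
// border pixels.
std::vector<std::uint8_t> medianFilter3x3(
    const std::vector<std::uint8_t>& gray, int width, int height) {
    std::vector<std::uint8_t> out(gray.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            std::array<std::uint8_t, 9> window{};
            std::size_t n = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int yy = std::min(std::max(y + dy, 0), height - 1);
                    const int xx = std::min(std::max(x + dx, 0), width - 1);
                    window[n++] =
                        gray[static_cast<std::size_t>(yy) * width + xx];
                }
            }
            // The median of nine values is the fifth smallest.
            std::nth_element(window.begin(), window.begin() + 4, window.end());
            out[static_cast<std::size_t>(y) * width + x] = window[4];
        }
    }
    return out;
}
```

Whether to run it before or after thresholding is a judgment call: before, it smooths sensor noise in the grayscale image; after, it removes isolated black or white dots from the binary result.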

I used provisia (it's a UI that helps you process images quickly) to find the block size I needed: 43 for the image you supplied here. The block size may change if you take the photo from closer or farther away. If you want a generic algorithm, you need to develop one that searches for the best size (search until numbers are detected).

EDIT: I just saw the last image. It is untreatably small. Even with the best preprocessing algorithm, you are not going to detect those numbers. Upsampling would not be a solution, since it would bring noise along with it.
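The search suggested above could be sketched roughly as follows. This is only an outline under stated assumptions: `findBlockSize` and the `ocrSucceeds` callback are hypothetical names, and the callback stands in for "binarize with this block size, run OCR, and check whether numbers were detected", which depends entirely on your OCR engine.

```cpp
#include <functional>

// Hypothetical helper: try odd block sizes from minBlock up to maxBlock
// (inclusive) and return the first one the caller's check accepts.
// Returns -1 if no candidate is accepted.
int findBlockSize(int minBlock, int maxBlock,
                  const std::function<bool(int)>& ocrSucceeds) {
    // Keep the block size odd so the window is centred on a pixel.
    if (minBlock % 2 == 0) ++minBlock;
    for (int blockSize = minBlock; blockSize <= maxBlock; blockSize += 2) {
        if (ocrSucceeds(blockSize)) {
            return blockSize;
        }
    }
    return -1;
}
```

Since each trial means a full binarization plus an OCR pass, it may be worth stepping more coarsely or starting near a block size that worked for a previous photo taken at a similar distance.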

Answered by baci, Oct 24 '22 17:10