
Image preprocessing for text recognition

What's the best set of image preprocessing operations to apply to images for text recognition in EmguCV?

I've included two sample images here.

Applying a low or high pass filter won't be suitable, as the text may be of any size. I've tried median and bilateral filters, but they don't seem to affect the image much.

The ideal result would be a binary image with all the text white, and most of the rest black. This image would then be sent to the OCR engine.

Thanks

Osiris asked Jul 13 '12 05:07



2 Answers

There is no single best set. Digital images come from different capture devices, and each device can apply its own built-in preprocessing (filters) and has other characteristics that can drastically change the image and even add noise to it. So each case has to be preprocessed differently.

However, there are common operations that can improve detection. A very basic one is to convert the image to grayscale and apply a threshold to binarize it. Another technique I've used before is the bounding box, which allows you to detect the text region. To remove noise from images you might be interested in erode/dilate operations. I demonstrate some of these operations on this post.
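As a rough illustration of the grayscale, threshold and bounding-box steps described above, here is a minimal sketch on a plain pixel buffer. It deliberately avoids OpenCV so that it compiles on its own; the names `Box`, `toGray`, `binarize` and `boundingBox` are made up for this example and are not library functions. In practice the library equivalents (a color conversion followed by `cv::threshold`) would normally be used instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type for this sketch; not an OpenCV type.
struct Box {
    int x0, y0, x1, y1;  // inclusive corners, meaningful only when empty == false
    bool empty;
};

// Convert interleaved 8-bit RGB to grayscale using the usual
// 0.299 / 0.587 / 0.114 luma weights in integer arithmetic.
std::vector<std::uint8_t> toGray(const std::vector<std::uint8_t>& rgb) {
    assert(rgb.size() % 3 == 0);
    std::vector<std::uint8_t> gray(rgb.size() / 3);
    for (std::size_t i = 0; i < gray.size(); ++i) {
        const unsigned r = rgb[3 * i], g = rgb[3 * i + 1], b = rgb[3 * i + 2];
        gray[i] = static_cast<std::uint8_t>((299 * r + 587 * g + 114 * b + 500) / 1000);
    }
    return gray;
}

// Binary threshold: pixels strictly above `thresh` become 255, the rest 0.
// Set `invert` for dark text on a light background so the text ends up white.
std::vector<std::uint8_t> binarize(const std::vector<std::uint8_t>& gray,
                                   std::uint8_t thresh, bool invert = false) {
    std::vector<std::uint8_t> bin(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        const bool above = gray[i] > thresh;
        bin[i] = (above != invert) ? 255 : 0;
    }
    return bin;
}

// Smallest axis-aligned box enclosing every white pixel of a row-major image.
Box boundingBox(const std::vector<std::uint8_t>& bin, int width, int height) {
    assert(width >= 0 && height >= 0);
    assert(bin.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    Box box{width, height, -1, -1, true};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (bin[static_cast<std::size_t>(y) * width + x] == 0) continue;
            box.x0 = std::min(box.x0, x);
            box.y0 = std::min(box.y0, y);
            box.x1 = std::max(box.x1, x);
            box.y1 = std::max(box.y1, y);
            box.empty = false;
        }
    }
    return box;
}
```

A typical call chain would be `boundingBox(binarize(toGray(rgb), 100), width, height)`. The threshold of 100 is only a starting point and usually needs tuning per image source.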

There are also other interesting posts about OCR and OpenCV that you should take a look at:

  • Simple Digit Recognition OCR in OpenCV-Python
  • Basic OCR in OpenCV

Now, just to show you a simple approach that can be used with your sample image, this is the result of inverting the color and applying a threshold:

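// Load the image given as the first command-line argument.
cv::Mat new_img = cv::imread(argv[1]);
// Invert the colors so that dark text becomes bright.
cv::bitwise_not(new_img, new_img);

double thres = 100;  // pixel values above this are set to `color`
double color = 255;  // value assigned to pixels that pass the threshold
cv::threshold(new_img, new_img, thres, color, cv::THRESH_BINARY);

cv::imwrite("inv_thres.png", new_img);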
karlphillip answered Oct 12 '22 23:10



Try morphological image processing. Have a look at this. However, it works only on binary images, so you will have to binarize the image first (with a threshold, for example). Although it is simple, it depends on font size, so one structuring element will not work for all font sizes. If you want a generic solution, there are a number of papers on text detection in images; a search for this term on Google Scholar should turn up some useful publications.
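To make the font-size dependence concrete, here is a minimal sketch of binary erosion, dilation and opening with a square structuring element. It is written against a plain buffer rather than OpenCV so that it compiles standalone; `Image`, `morph`, `erodeBinary`, `dilateBinary` and `openBinary` are names invented for this example, and ignoring pixels outside the image border is an assumption of the sketch, not necessarily what a given library does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Image = std::vector<std::uint8_t>;  // row-major binary image, values 0 or 255

// Shared worker for a (2r+1) x (2r+1) square structuring element.
// Erosion keeps a pixel white only if every in-bounds pixel in its window
// is white; dilation turns it white if any pixel in the window is white.
static Image morph(const Image& src, int w, int h, int r, bool isErosion) {
    assert(w >= 0 && h >= 0 && r >= 0);
    assert(src.size() == static_cast<std::size_t>(w) * static_cast<std::size_t>(h));
    Image dst(src.size());
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            bool all = true, any = false;
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    const int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w) continue;  // skip border
                    const bool on = src[static_cast<std::size_t>(yy) * w + xx] != 0;
                    all = all && on;
                    any = any || on;
                }
            }
            dst[static_cast<std::size_t>(y) * w + x] = (isErosion ? all : any) ? 255 : 0;
        }
    }
    return dst;
}

Image erodeBinary(const Image& src, int w, int h, int r) { return morph(src, w, h, r, true); }
Image dilateBinary(const Image& src, int w, int h, int r) { return morph(src, w, h, r, false); }

// Opening (erode, then dilate) removes white specks smaller than the element
// while roughly preserving shapes that are larger than it.
Image openBinary(const Image& src, int w, int h, int r) {
    return dilateBinary(erodeBinary(src, w, h, r), w, h, r);
}
```

With `r = 1`, an isolated white pixel is removed while a 3x3 white block survives; with `r = 2` the same 3x3 block is erased as well. That is the font-size problem in miniature: an element large enough to clean up noise around big glyphs can wipe out the strokes of small ones.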

go4sri answered Oct 13 '22 00:10
