Remove background noise from image to make text more clear for OCR

I've written an application that segments an image based on the text regions within it, and extracts those regions as I see fit. What I'm attempting to do is clean the image so OCR (Tesseract) gives an accurate result. I have the following image as an example:

enter image description here

Running this through Tesseract gives a wildly inaccurate result. However, cleaning up the image (using Photoshop) to get the following image:

enter image description here

gives exactly the result I would expect. The first image has already been run through the following method to clean it to that point:

    public Mat cleanImage(Mat srcImage) {
        Core.normalize(srcImage, srcImage, 0, 255, Core.NORM_MINMAX);
        Imgproc.threshold(srcImage, srcImage, 0, 255, Imgproc.THRESH_OTSU);
        Imgproc.erode(srcImage, srcImage, new Mat());
        Imgproc.dilate(srcImage, srcImage, new Mat(), new Point(0, 0), 9);
        return srcImage;
    }

What more can I do to clean the first image so it resembles the second image?

Edit: This is the original image before it's run through the cleanImage function.

enter image description here

asked Nov 23 '15 21:11

Zy0n

1 Answer

My answer is based on the following assumptions. It's possible that none of them holds in your case.

It's possible for you to impose a threshold for bounding box heights in the segmented region. Then you should be able to filter out other components.
You know the average stroke width of the digits. Use this information to minimize the chance that the digits are connected to other regions. You can use the distance transform and morphological operations for this.
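To make the stroke-width assumption concrete, here is a minimal pure-Java sketch of the idea, not the OpenCV implementation: a brute-force distance transform followed by a threshold at half the stroke width. The class and method names (`StrokeWidthFilterSketch`, `keepThickStrokes`) and the tiny demo image are assumptions made for illustration only; in real code you would use OpenCV's `distanceTransform` and `threshold` instead.

```java
final class StrokeWidthFilterSketch {

    /**
     * Brute-force Euclidean distance transform: for every foreground pixel,
     * the distance to the nearest background pixel (background pixels stay 0).
     * Quadratic in the pixel count, so only suitable for tiny demo images.
     */
    static double[][] distanceTransform(boolean[][] fg) {
        int h = fg.length, w = fg[0].length;
        double[][] dist = new double[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!fg[y][x]) continue;
                double best = Double.POSITIVE_INFINITY;
                for (int by = 0; by < h; by++) {
                    for (int bx = 0; bx < w; bx++) {
                        if (!fg[by][bx]) {
                            best = Math.min(best, Math.hypot(x - bx, y - by));
                        }
                    }
                }
                dist[y][x] = best;
            }
        }
        return dist;
    }

    /** Keeps only pixels whose distance to the background exceeds strokeWidth / 2. */
    static boolean[][] keepThickStrokes(boolean[][] fg, double strokeWidth) {
        double[][] dist = distanceTransform(fg);
        boolean[][] keep = new boolean[fg.length][fg[0].length];
        for (int y = 0; y < fg.length; y++) {
            for (int x = 0; x < fg[0].length; x++) {
                keep[y][x] = dist[y][x] > strokeWidth / 2.0;
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        // Hypothetical 7 x 12 image: a 6-pixel-wide bar (columns 1..6)
        // standing in for a digit stroke, and a 1-pixel line (column 9)
        // standing in for thin noise.
        boolean[][] fg = new boolean[7][12];
        for (int y = 0; y < 7; y++) {
            for (int x = 1; x <= 6; x++) fg[y][x] = true;
            fg[y][9] = true;
        }
        for (boolean[] row : keepThickStrokes(fg, 4.0)) {
            StringBuilder sb = new StringBuilder();
            for (boolean b : row) sb.append(b ? '#' : '.');
            System.out.println(sb);
        }
    }
}
```

Note that what survives the threshold is only the core of each thick stroke, so the kept strokes come out thinner than the originals, while anything narrower than the chosen stroke width disappears entirely.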

This is my procedure for extracting the digits:

Apply an Otsu threshold to the image
Take the distance transform
Threshold the distance-transformed image using the stroke-width constraint (stroke width = 8, so the threshold is 8/2 = 4)
Apply a morphological opening to disconnect the digits from any regions still attached to them
Filter by bounding-box height and make a guess where the digits are
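The last step of the procedure, filtering by bounding-box height, can be sketched in plain Java as follows. This is an illustration under assumptions, not the OpenCV code path: it labels 8-connected components with a flood fill instead of calling `findContours` and `boundingRect`, and the names `HeightFilterSketch` and `keepTallComponents` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

final class HeightFilterSketch {

    /**
     * Returns a copy of fg that keeps only the 8-connected components
     * whose bounding box is strictly taller than minHeight pixels.
     */
    static boolean[][] keepTallComponents(boolean[][] fg, double minHeight) {
        int h = fg.length, w = fg[0].length;
        boolean[][] seen = new boolean[h][w];
        boolean[][] out = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!fg[y][x] || seen[y][x]) continue;
                // Flood-fill one component, tracking its top and bottom rows.
                List<int[]> pixels = new ArrayList<>();
                ArrayDeque<int[]> queue = new ArrayDeque<>();
                queue.add(new int[] {y, x});
                seen[y][x] = true;
                int top = y, bottom = y;
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    pixels.add(p);
                    top = Math.min(top, p[0]);
                    bottom = Math.max(bottom, p[0]);
                    for (int dy = -1; dy <= 1; dy++) {
                        for (int dx = -1; dx <= 1; dx++) {
                            int ny = p[0] + dy, nx = p[1] + dx;
                            if (ny < 0 || ny >= h || nx < 0 || nx >= w) continue;
                            if (fg[ny][nx] && !seen[ny][nx]) {
                                seen[ny][nx] = true;
                                queue.add(new int[] {ny, nx});
                            }
                        }
                    }
                }
                int boxHeight = bottom - top + 1;
                if (boxHeight > minHeight) {
                    for (int[] p : pixels) out[p[0]][p[1]] = true;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical 10 x 8 image: a tall stroke in column 1 and a small blob.
        boolean[][] fg = new boolean[10][8];
        for (int y = 1; y <= 8; y++) fg[y][1] = true;
        fg[4][5] = true; fg[4][6] = true; fg[5][5] = true; fg[5][6] = true;
        boolean[][] out = keepTallComponents(fg, fg.length * 0.5);
        System.out.println("tall stroke kept: " + out[4][1]);
        System.out.println("small blob kept:  " + out[4][5]);
    }
}
```

The threshold of half the image height mirrors the `HTHRESH` value used in the code further down; it only makes sense when the segmented region is cropped tightly around one line of digits.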

[Images: detected bounding boxes for stroke-width = 8 and stroke-width = 10]

EDIT

Prepare a mask using the convex hull of the found digit contours
Copy the digits region to a clean image using the mask
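The two steps above can be sketched without OpenCV as follows: build the convex hull of the digit points, turn it into a filled mask, and copy only the masked pixels into a blank image. This is a minimal sketch under assumptions. It uses Andrew's monotone-chain hull and a per-pixel inside test in place of `convexHull` and `drawContours`, it skips the mask dilation, and all names (`HullMaskSketch`, `hullMask`, `copyWithMask`) are hypothetical. Points are stored as `{x, y}` pairs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class HullMaskSketch {

    /** Cross product of (a - o) and (b - o); positive for a counter-clockwise turn. */
    static long cross(int[] o, int[] a, int[] b) {
        return (long) (a[0] - o[0]) * (b[1] - o[1]) - (long) (a[1] - o[1]) * (b[0] - o[0]);
    }

    /** Convex hull via Andrew's monotone chain; returns hull vertices in order. */
    static List<int[]> convexHull(List<int[]> pts) {
        List<int[]> p = new ArrayList<>(pts);
        p.sort((a, b) -> a[0] != b[0] ? Integer.compare(a[0], b[0]) : Integer.compare(a[1], b[1]));
        int n = p.size();
        if (n < 3) return p;
        int[][] hull = new int[2 * n][];
        int k = 0;
        for (int i = 0; i < n; i++) {                      // lower hull
            while (k >= 2 && cross(hull[k - 2], hull[k - 1], p.get(i)) <= 0) k--;
            hull[k++] = p.get(i);
        }
        for (int i = n - 2, t = k + 1; i >= 0; i--) {      // upper hull
            while (k >= t && cross(hull[k - 2], hull[k - 1], p.get(i)) <= 0) k--;
            hull[k++] = p.get(i);
        }
        return new ArrayList<>(Arrays.asList(hull).subList(0, k - 1));
    }

    /** Mask that is true for every pixel inside or on the hull of the given points. */
    static boolean[][] hullMask(List<int[]> points, int rows, int cols) {
        boolean[][] mask = new boolean[rows][cols];
        List<int[]> hull = convexHull(points);
        if (hull.size() < 3) return mask;                  // degenerate hull: empty mask
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                int[] q = {x, y};
                boolean inside = true;
                for (int i = 0; i < hull.size() && inside; i++) {
                    inside = cross(hull.get(i), hull.get((i + 1) % hull.size()), q) >= 0;
                }
                mask[y][x] = inside;
            }
        }
        return mask;
    }

    /** Copies src into an all-zero image, but only where mask is true. */
    static int[][] copyWithMask(int[][] src, boolean[][] mask) {
        int[][] out = new int[src.length][src[0].length];
        for (int y = 0; y < src.length; y++) {
            for (int x = 0; x < src[0].length; x++) {
                if (mask[y][x]) out[y][x] = src[y][x];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] src = new int[6][6];
        for (int[] row : src) Arrays.fill(row, 255);
        List<int[]> digitPoints = Arrays.asList(
                new int[] {1, 1}, new int[] {4, 1}, new int[] {4, 4}, new int[] {1, 4}, new int[] {2, 2});
        int[][] cleaned = copyWithMask(src, hullMask(digitPoints, 6, 6));
        for (int[] row : cleaned) System.out.println(Arrays.toString(row));
    }
}
```

A single hull over all digit contours keeps everything between the digits as well, so noise that sits inside the hull survives this step.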

[Images: cleaned result for stroke-width = 8 and stroke-width = 10]

My Tesseract knowledge is a bit rusty. As far as I remember, you can get a confidence level for each recognized character. If noisy regions still end up detected as character bounding boxes, you may be able to filter them out using this information.
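As a rough illustration of that confidence-based filtering, the sketch below drops recognized characters whose confidence falls below a cutoff. Everything here is an assumption for illustration: the `OcrChar` type, the `keepConfident` method, the 0 to 100 confidence scale, and the cutoff of 60 are hypothetical and do not come from any Tesseract binding. How you obtain the per-character confidences depends on the wrapper you use.

```java
import java.util.Arrays;
import java.util.List;

final class ConfidenceFilterSketch {

    /** Hypothetical holder for one recognized character and its confidence. */
    static final class OcrChar {
        final String text;
        final float confidence; // assumed to be on a 0..100 scale

        OcrChar(String text, float confidence) {
            this.text = text;
            this.confidence = confidence;
        }
    }

    /** Concatenates the characters whose confidence is at least minConfidence. */
    static String keepConfident(List<OcrChar> chars, float minConfidence) {
        StringBuilder sb = new StringBuilder();
        for (OcrChar c : chars) {
            if (c.confidence >= minConfidence) sb.append(c.text);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up values: two confident digits and one low-confidence noise symbol.
        List<OcrChar> chars = Arrays.asList(
                new OcrChar("1", 95f), new OcrChar("~", 20f), new OcrChar("2", 90f));
        System.out.println(keepConfident(chars, 60f));
    }
}
```

The cutoff needs tuning on real data, since a value that is too high also discards correctly recognized but degraded digits.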

C++ Code

    Mat im = imread("aRh8C.png", 0);
    // apply Otsu threshold
    Mat bw;
    threshold(im, bw, 0, 255, CV_THRESH_BINARY_INV | CV_THRESH_OTSU);
    // take the distance transform
    Mat dist;
    distanceTransform(bw, dist, CV_DIST_L2, CV_DIST_MASK_PRECISE);
    Mat dibw;
    // threshold the distance transformed image
    double SWTHRESH = 8;    // stroke width threshold
    threshold(dist, dibw, SWTHRESH/2, 255, CV_THRESH_BINARY);
    Mat kernel = getStructuringElement(MORPH_RECT, Size(3, 3));
    // perform opening, in case digits are still connected
    Mat morph;
    morphologyEx(dibw, morph, CV_MOP_OPEN, kernel);
    dibw.convertTo(dibw, CV_8U);
    // find contours and filter
    Mat cont;
    morph.convertTo(cont, CV_8U);

    Mat binary;
    cvtColor(dibw, binary, CV_GRAY2BGR);

    const double HTHRESH = im.rows * .5;    // height threshold
    vector<vector<Point>> contours;
    vector<Vec4i> hierarchy;
    vector<Point> digits; // points corresponding to digit contours

    findContours(cont, contours, hierarchy, CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE, Point(0, 0));
    for(int idx = 0; idx >= 0; idx = hierarchy[idx][0])
    {
        Rect rect = boundingRect(contours[idx]);
        if (rect.height > HTHRESH)
        {
            // append the points of this contour to digit points
            digits.insert(digits.end(), contours[idx].begin(), contours[idx].end());

            rectangle(binary,
                Point(rect.x, rect.y), Point(rect.x + rect.width - 1, rect.y + rect.height - 1),
                Scalar(0, 0, 255), 1);
        }
    }

    // take the convexhull of the digit contours
    vector<Point> digitsHull;
    convexHull(digits, digitsHull);
    // prepare a mask
    vector<vector<Point>> digitsRegion;
    digitsRegion.push_back(digitsHull);
    Mat digitsMask = Mat::zeros(im.rows, im.cols, CV_8U);
    drawContours(digitsMask, digitsRegion, 0, Scalar(255, 255, 255), -1);
    // expand the mask to include any information we lost in earlier morphological opening
    morphologyEx(digitsMask, digitsMask, CV_MOP_DILATE, kernel);
    // copy the region to get a cleaned image
    Mat cleaned = Mat::zeros(im.rows, im.cols, CV_8U);
    dibw.copyTo(cleaned, digitsMask);

EDIT

Java Code

    Mat im = Highgui.imread("aRh8C.png", 0);
    // apply Otsu threshold
    Mat bw = new Mat(im.size(), CvType.CV_8U);
    Imgproc.threshold(im, bw, 0, 255, Imgproc.THRESH_BINARY_INV | Imgproc.THRESH_OTSU);
    // take the distance transform
    Mat dist = new Mat(im.size(), CvType.CV_32F);
    Imgproc.distanceTransform(bw, dist, Imgproc.CV_DIST_L2, Imgproc.CV_DIST_MASK_PRECISE);
    // threshold the distance transform
    Mat dibw32f = new Mat(im.size(), CvType.CV_32F);
    final double SWTHRESH = 8.0;    // stroke width threshold
    Imgproc.threshold(dist, dibw32f, SWTHRESH/2.0, 255, Imgproc.THRESH_BINARY);
    Mat dibw8u = new Mat(im.size(), CvType.CV_8U);
    dibw32f.convertTo(dibw8u, CvType.CV_8U);

    Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(3, 3));
    // open to remove connections to stray elements
    Mat cont = new Mat(im.size(), CvType.CV_8U);
    Imgproc.morphologyEx(dibw8u, cont, Imgproc.MORPH_OPEN, kernel);
    // find contours and filter based on bounding-box height
    final double HTHRESH = im.rows() * 0.5; // bounding-box height threshold
    List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
    List<Point> digits = new ArrayList<Point>();    // contours of the possible digits
    Imgproc.findContours(cont, contours, new Mat(), Imgproc.RETR_CCOMP, Imgproc.CHAIN_APPROX_SIMPLE);
    for (int i = 0; i < contours.size(); i++)
    {
        if (Imgproc.boundingRect(contours.get(i)).height > HTHRESH)
        {
            // this contour passed the bounding-box height threshold. add it to digits
            digits.addAll(contours.get(i).toList());
        }
    }
    // find the convexhull of the digit contours
    MatOfInt digitsHullIdx = new MatOfInt();
    MatOfPoint hullPoints = new MatOfPoint();
    hullPoints.fromList(digits);
    Imgproc.convexHull(hullPoints, digitsHullIdx);
    // convert hull index to hull points
    List<Point> digitsHullPointsList = new ArrayList<Point>();
    List<Point> points = hullPoints.toList();
    for (Integer i: digitsHullIdx.toList())
    {
        digitsHullPointsList.add(points.get(i));
    }
    MatOfPoint digitsHullPoints = new MatOfPoint();
    digitsHullPoints.fromList(digitsHullPointsList);
    // create the mask for digits
    List<MatOfPoint> digitRegions = new ArrayList<MatOfPoint>();
    digitRegions.add(digitsHullPoints);
    Mat digitsMask = Mat.zeros(im.size(), CvType.CV_8U);
    Imgproc.drawContours(digitsMask, digitRegions, 0, new Scalar(255, 255, 255), -1);
    // dilate the mask to capture any info we lost in earlier opening
    Imgproc.morphologyEx(digitsMask, digitsMask, Imgproc.MORPH_DILATE, kernel);
    // cleaned image ready for OCR
    Mat cleaned = Mat.zeros(im.size(), CvType.CV_8U);
    dibw8u.copyTo(cleaned, digitsMask);
    // feed cleaned to Tesseract

answered Oct 10 '22 09:10

dhanushka

Tags: java, c++, opencv, ocr
