Extracting text from an image with tesseract [duplicate]

Question

I asked a similar question here but that is focused more on tesseract.

I have a sample image as below. I would like to make the white square my Region of Interest and then crop out that part (square) and create a new image with it. I will be working with different images so the square won't always be at the same location in all images. So I will need to somehow detect the edges of the square.

enter image description here

What are some pre-processing methods I can perform to achieve the result?

karlphillip · Accepted Answer

Using your test image I was able to remove all the noises with a simple erosion operation.

After that, a simple iteration on the Mat to find for the corner pixels is trivial, and I talked about that on this answer. For testing purposes we can draw green lines between those points to display the area we are interested at in the original image:

At the end, I set the ROI in the original image and crop out that part.

The final result is displayed on the image below:

I wrote a sample code that performs this task using the C++ interface of OpenCV. I'm confident in your skills to translate this code to Python. If you can't do it, forget the code and stick with the roadmap I shared on this answer.

#include <cv.h>
#include <highgui.h>

int main(int argc, char* argv[])
{
    cv::Mat img = cv::imread(argv[1]);
    std::cout << "Original image size: " << img.size() << std::endl;

    // Convert RGB Mat to GRAY
    cv::Mat gray;
    cv::cvtColor(img, gray, CV_BGR2GRAY);
    std::cout << "Gray image size: " << gray.size() << std::endl;

    // Erode image to remove unwanted noises
    int erosion_size = 5;
    cv::Mat element = cv::getStructuringElement(cv::MORPH_CROSS,
                                       cv::Size(2 * erosion_size + 1, 2 * erosion_size + 1),
                                       cv::Point(erosion_size, erosion_size) );
    cv::erode(gray, gray, element);

    // Scan the image searching for points and store them in a vector
    std::vector<cv::Point> points;
    cv::Mat_<uchar>::iterator it = gray.begin<uchar>();
    cv::Mat_<uchar>::iterator end = gray.end<uchar>();
    for (; it != end; it++)
    {
        if (*it) 
            points.push_back(it.pos()); 
    }

    // From the points, figure out the size of the ROI
    int left, right, top, bottom;
    for (int i = 0; i < points.size(); i++)
    {
        if (i == 0) // initialize corner values
        {
            left = right = points[i].x;
            top = bottom = points[i].y;
        }

        if (points[i].x < left)
            left = points[i].x;

        if (points[i].x > right)
            right = points[i].x;

        if (points[i].y < top)
            top = points[i].y;

        if (points[i].y > bottom)
            bottom = points[i].y;
    }
    std::vector<cv::Point> box_points;
    box_points.push_back(cv::Point(left, top));
    box_points.push_back(cv::Point(left, bottom));
    box_points.push_back(cv::Point(right, bottom));
    box_points.push_back(cv::Point(right, top));

    // Compute minimal bounding box for the ROI
    // Note: for some unknown reason, width/height of the box are switched.
    cv::RotatedRect box = cv::minAreaRect(cv::Mat(box_points));
    std::cout << "box w:" << box.size.width << " h:" << box.size.height << std::endl;

    // Draw bounding box in the original image (debugging purposes)
    //cv::Point2f vertices[4];
    //box.points(vertices);
    //for (int i = 0; i < 4; ++i)
    //{
    //    cv::line(img, vertices[i], vertices[(i + 1) % 4], cv::Scalar(0, 255, 0), 1, CV_AA);
    //}
    //cv::imshow("Original", img);
    //cv::waitKey(0);

    // Set the ROI to the area defined by the box
    // Note: because the width/height of the box are switched, 
    // they were switched manually in the code below:
    cv::Rect roi;
    roi.x = box.center.x - (box.size.height / 2);
    roi.y = box.center.y - (box.size.width / 2);
    roi.width = box.size.height;
    roi.height = box.size.width;
    std::cout << "roi @ " << roi.x << "," << roi.y << " " << roi.width << "x" << roi.height << std::endl;

    // Crop the original image to the defined ROI
    cv::Mat crop = img(roi);

    // Display cropped ROI
    cv::imshow("Cropped ROI", crop);
    cv::waitKey(0);

    return 0;
}

HugoRune · Answer

Seeing that the text is the only large blob, and everything else is barely larger than a pixel, a simple morphological opening should suffice

You can do this in opencv or with imagemagic

Afterwards the white rectangle should be the only thing left in the image. You can find it with opencvs findcontours, with the CvBlobs library for opencv or with the imagemagick -crop function

Here is your image with 2 steps of erosion followed by 2 steps of dilation applied: enter image description here You can simply plug this image into the opencv findContours function as in the Squares tutorial example to get the position

Extracting text from an image with tesseract [duplicate]

Tags:

c++

python

image-processing

opencv

numpy

birdy

2 Answers

karlphillip

HugoRune

Recent Activity

Donate For Us

Extracting text from an image with tesseract [duplicate]

Tags:

c++

python

image-processing

opencv

numpy

birdy

2 Answers

karlphillip

HugoRune

Related questions

Recent Activity

Donate For Us