I'm trying to figure out what technologies I would need to process images and extract characters from them.
Specifically, in this example, I need to extract the hashtag that is circled. You can see it here:
Any implementations would be of great assistance.
Feature extraction refers to the process of transforming raw data into numerical features that can be processed while preserving the information in the original data set. It yields better results than applying machine learning directly to the raw data.
Before getting features, various image preprocessing techniques like binarization, thresholding, resizing, normalization etc. are applied on the sampled image. After that, feature extraction techniques are applied to get features that will be useful in classifying and recognition of images.
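As a rough illustration, a minimal preprocessing chain along these lines could look like the OpenCV sketch below. The file name and parameter values are placeholders chosen for the example, not something taken from the question:

#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>

int main()
{
    // Placeholder input file; replace with your own image
    cv::Mat input = cv::imread("sample.png");
    if (input.empty())
        return -1;

    cv::Mat gray, resized, binary, normalized;

    // Grayscale conversion (1-channel)
    cv::cvtColor(input, gray, cv::COLOR_BGR2GRAY);

    // Resize (up-scale by 2x, helps with small text)
    cv::resize(gray, resized, cv::Size(), 2.0, 2.0);

    // Binarization with Otsu's automatic threshold
    cv::threshold(resized, binary, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);

    // Normalization to the [0, 1] range as floating point
    binary.convertTo(normalized, CV_32F, 1.0 / 255.0);

    cv::imwrite("preprocessed.png", binary);
    return 0;
}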
Text extraction is one of the key tasks in document image analysis. Automatic text extraction, even without character recognition capabilities, aims to extract only the regions that contain text. The text extraction process includes detection, localization, segmentation and enhancement of the text in the given input image.
You could use OpenCV together with Tesseract, though I think there might be easier ways. OpenCV is an open source library used to build computer vision applications, and Tesseract is an open source OCR engine.
Before we start, let me clarify something: that is not a circle, it's a rounded rectangle.
I'm sharing the source code of the application that I wrote to demonstrate how the problem can be solved, as well as some tips on what's going on. This answer is not meant to teach digital image processing, and the reader is expected to have a minimal understanding of this field.
I will describe very briefly what the larger sections of the code do. Most of the next chunk of code comes from squares.cpp, a sample application that ships with OpenCV to detect squares in images.
#include <iostream>
#include <vector>
#include <cmath>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// angle: helper function.
// Finds a cosine of angle between vectors from pt0->pt1 and from pt0->pt2.
double angle(cv::Point pt1, cv::Point pt2, cv::Point pt0)
{
    double dx1 = pt1.x - pt0.x;
    double dy1 = pt1.y - pt0.y;
    double dx2 = pt2.x - pt0.x;
    double dy2 = pt2.y - pt0.y;
    return (dx1*dx2 + dy1*dy2)/sqrt((dx1*dx1 + dy1*dy1)*(dx2*dx2 + dy2*dy2) + 1e-10);
}

// findSquares: returns sequence of squares detected on the image.
// The sequence is stored in the specified memory storage.
void findSquares(const cv::Mat& image, std::vector<std::vector<cv::Point> >& squares)
{
    cv::Mat pyr, timg;

    // Down-scale and up-scale the image to filter out small noises
    cv::pyrDown(image, pyr, cv::Size(image.cols/2, image.rows/2));
    cv::pyrUp(pyr, timg, image.size());

    // Apply Canny with a threshold of 50
    cv::Canny(timg, timg, 0, 50, 5);

    // Dilate canny output to remove potential holes between edge segments
    cv::dilate(timg, timg, cv::Mat(), cv::Point(-1,-1));

    // Find contours and store them all as a list
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(timg, contours, CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);

    // Test each contour
    for (size_t i = 0; i < contours.size(); i++)
    {
        // Approximate contour with accuracy proportional to the contour perimeter
        std::vector<cv::Point> approx;
        cv::approxPolyDP(cv::Mat(contours[i]), approx,
                         cv::arcLength(cv::Mat(contours[i]), true) * 0.02, true);

        // Square contours should have 4 vertices after approximation,
        // a relatively large area (to filter out noisy contours),
        // and be convex.
        // Note: the absolute value of the area is used because
        // the area may be positive or negative, in accordance with the
        // contour orientation
        if (approx.size() == 4 &&
            fabs(cv::contourArea(cv::Mat(approx))) > 1000 &&
            cv::isContourConvex(cv::Mat(approx)))
        {
            double maxCosine = 0;
            for (int j = 2; j < 5; j++)
            {
                // Find the maximum cosine of the angle between joint edges
                double cosine = fabs(angle(approx[j%4], approx[j-2], approx[j-1]));
                maxCosine = MAX(maxCosine, cosine);
            }

            // If the cosines of all angles are small
            // (all angles are ~90 degrees) then write the quadrangle
            // vertices to the resultant sequence
            if (maxCosine < 0.3)
                squares.push_back(approx);
        }
    }
}

// drawSquares: draws all the squares found in the image
void drawSquares(cv::Mat& image, const std::vector<std::vector<cv::Point> >& squares)
{
    for (size_t i = 0; i < squares.size(); i++)
    {
        const cv::Point* p = &squares[i][0];
        int n = (int)squares[i].size();
        cv::polylines(image, &p, &n, 1, true, cv::Scalar(0,255,0), 2, CV_AA);
    }

    cv::imshow("drawSquares", image);
}
Ok, so our program begins at:
int main(int argc, char* argv[])
{
    // Load input image (colored, 3-channel)
    cv::Mat input = cv::imread(argv[1]);
    if (input.empty())
    {
        std::cout << "!!! failed imread()" << std::endl;
        return -1;
    }

    // Convert input image to grayscale (1-channel)
    cv::Mat grayscale = input.clone();
    cv::cvtColor(input, grayscale, cv::COLOR_BGR2GRAY);
    //cv::imwrite("gray.png", grayscale);
What grayscale looks like:
    // Threshold to binarize the image and get rid of the shoe
    cv::Mat binary;
    cv::threshold(grayscale, binary, 225, 255, cv::THRESH_BINARY_INV);
    cv::imshow("Binary image", binary);
    //cv::imwrite("binary.png", binary);
What binary looks like:
    // Find the contours in the thresholded image
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(binary, contours, cv::RETR_LIST, cv::CHAIN_APPROX_SIMPLE);

    // Fill the areas of the contours with BLUE (hoping to erase everything inside a rectangular shape)
    cv::Mat blue = input.clone();
    for (size_t i = 0; i < contours.size(); i++)
    {
        std::vector<cv::Point> cnt = contours[i];
        double area = cv::contourArea(cv::Mat(cnt));
        //std::cout << "* Area: " << area << std::endl;
        cv::drawContours(blue, contours, i, cv::Scalar(255, 0, 0),
                         CV_FILLED, 8, std::vector<cv::Vec4i>(), 0, cv::Point());
    }
    cv::imshow("Contours Filled", blue);
    //cv::imwrite("contours.png", blue);
What blue looks like:
    // Convert the blue colored image to binary (again), and we will have a good rectangular shape to detect
    cv::Mat gray;
    cv::cvtColor(blue, gray, cv::COLOR_BGR2GRAY);
    cv::threshold(gray, binary, 225, 255, cv::THRESH_BINARY_INV);
    cv::imshow("binary2", binary);
    //cv::imwrite("binary2.png", binary);
What binary looks like at this point:
    // Erode & Dilate to isolate segments connected to nearby areas
    int erosion_type = cv::MORPH_RECT;
    int erosion_size = 5;
    cv::Mat element = cv::getStructuringElement(erosion_type,
                          cv::Size(2 * erosion_size + 1, 2 * erosion_size + 1),
                          cv::Point(erosion_size, erosion_size));
    cv::erode(binary, binary, element);
    cv::dilate(binary, binary, element);
    cv::imshow("Morphologic Op", binary);
    //cv::imwrite("morpho.png", binary);
What binary looks like at this point:
    // Ok, let's go ahead and try to detect all rectangular shapes
    std::vector<std::vector<cv::Point> > squares;
    findSquares(binary, squares);
    std::cout << "* Rectangular shapes found: " << squares.size() << std::endl;

    // Draw all rectangular shapes found
    cv::Mat output = input.clone();
    drawSquares(output, squares);
    //cv::imwrite("output.png", output);
What output looks like:
Alright! We solved the first part of the problem which was finding the rounded rectangle. You can see in the image above that the rectangular shape was detected and green lines were drawn over the original image for educational purposes.
The second part is much easier. It begins by creating a ROI (Region of Interest) in the original image so we can crop it to the area inside the rounded rectangle. Once this is done, the cropped image is saved to disk as a TIFF file, which is then fed to Tesseract to do its magic:
    // Crop the rectangular shape
    if (squares.size() == 1)
    {
        cv::Rect box = cv::boundingRect(cv::Mat(squares[0]));
        std::cout << "* The location of the box is x:" << box.x << " y:" << box.y
                  << " " << box.width << "x" << box.height << std::endl;

        // Crop the original image to the defined ROI
        cv::Mat crop = input(box);
        cv::imshow("crop", crop);
        //cv::imwrite("cropped.tiff", crop);
    }
    else
    {
        std::cout << "* Abort! More than one rectangle was found." << std::endl;
    }

    // Wait until user presses key
    cv::waitKey(0);
    return 0;
}
What crop looks like:
When this application finishes its job, it creates a file named cropped.tiff on the disk. Go to the command line and invoke Tesseract to detect the text present in the cropped image:
tesseract cropped.tiff out
This command creates a file named out.txt with the detected text.
Tesseract has an API that you can use to add the OCR feature into your application.
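For reference, here is a minimal sketch of what that could look like with the Tesseract C++ API (tesseract::TessBaseAPI). It reuses the cropped.tiff file produced above, but everything else is only an illustration, not part of the application shown in this answer:

#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <cstdio>

int main()
{
    tesseract::TessBaseAPI api;

    // Initialize with the English language data; returns non-zero on failure
    if (api.Init(nullptr, "eng"))
        return -1;

    // Load the cropped image produced by the OpenCV application above
    Pix* image = pixRead("cropped.tiff");
    if (!image)
        return -1;

    // Run recognition and print the result
    api.SetImage(image);
    char* text = api.GetUTF8Text();
    printf("OCR output: %s\n", text);

    // Clean up
    delete[] text;
    api.End();
    pixDestroy(&image);
    return 0;
}

Link it against the Tesseract and Leptonica libraries (typically -ltesseract -llept) when building.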
This solution is not robust, and you will probably have to make some changes here and there to make it work for other test cases.
There are a few alternatives: Java OCR implementation
They mention the following tools:
And a few others.
This list of links can also be useful: http://www.javawhat.com/showCategory.do?id=2138003
Generally this kind of task requires a lot of trial and testing. Probably the best tool depends much more on the profile of your input data than anything else.