 

Is there an OCR library that outputs coordinates of words found within an image? [closed]

Tags:

ocr

In my experience, OCR libraries tend to merely output the text found within an image but not where the text was found. Is there an OCR library that outputs both the words found within an image as well as the coordinates (x, y, width, height) where those words were found?

Adam Paynter asked Feb 18 '11 12:02




2 Answers

Most commercial OCR engines will return word and character coordinate positions, but you have to work with their SDKs to extract the information. Even Tesseract OCR will return position information, but it has not been easy to get to. Version 3.01 will make it easier, but a DLL interface is still being worked on.

Unfortunately, most free OCR programs use Tesseract OCR in its basic form and they only report the raw ASCII results.

www.transym.com - Transym OCR - outputs coordinates.
www.rerecognition.com - KADMOS engine - returns coordinates.

Caere Omnipage, Mitek, Abbyy, and Charactell also return character positions.
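As a rough illustration of the Tesseract position information mentioned above: Tesseract 3.x can write hOCR, an HTML-based format in which each word is a span whose title attribute carries "bbox x0 y0 x1 y1". hOCR is not named in this answer, so treat its use here as an assumption. The sketch below uses only the Python standard library; `HocrWordParser` and `hocr_words` are hypothetical names, and the class names it looks for (`ocrx_word`, `ocr_word`) can vary between Tesseract versions.

```python
import re
from html.parser import HTMLParser

# hOCR stores a box as "bbox x0 y0 x1 y1" inside the title attribute.
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")
WORD_CLASSES = {"ocrx_word", "ocr_word"}  # assumption: differs by version


class HocrWordParser(HTMLParser):
    """Collects text and (x, y, width, height) for every hOCR word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None   # bbox of the word span currently open, if any
        self._text = []
        self._depth = 0     # spans nested inside the open word span

    def handle_starttag(self, tag, attrs):
        if self._bbox is not None:
            if tag == "span":
                self._depth += 1
            return
        if tag != "span":
            return
        attr = dict(attrs)
        classes = set((attr.get("class") or "").split())
        match = BBOX_RE.search(attr.get("title") or "")
        if classes & WORD_CLASSES and match:
            self._bbox = tuple(int(g) for g in match.groups())
            self._text = []
            self._depth = 0

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "span" or self._bbox is None:
            return
        if self._depth > 0:
            self._depth -= 1
            return
        x0, y0, x1, y1 = self._bbox
        self.words.append({
            "text": "".join(self._text).strip(),
            "x": x0, "y": y0,
            "width": x1 - x0, "height": y1 - y0,
        })
        self._bbox = None


def hocr_words(hocr):
    """Return a list of word dicts parsed from an hOCR string."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words
```

Calling `hocr_words(open("page.html").read())` on a file produced by Tesseract's hOCR output (the file name is a placeholder) would give one dict per word. Width and height are computed as simple differences of the box corners; whether the far edge is inclusive is not settled here.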

Andrew Cash answered Sep 18 '22 17:09


I'm using TessNet (a Tesseract C# wrapper) and I'm getting word coordinates with the following code:

    // Needs: System.Collections.Generic, System.Drawing, System.IO, System.Windows.Forms
    Bitmap image = new Bitmap(@"u:\user files\bwalker\2849257.tif");
    tessnet2.Tesseract ocr = new tessnet2.Tesseract();

    // Restrict recognition to these characters (use "0123456789" for digits only)
    ocr.SetVariable("tessedit_char_whitelist", "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.,$-/#&=()\"':?");

    // To use correct tessdata
    ocr.Init(@"C:\Users\bwalker\Documents\Visual Studio 2010\Projects\tessnetWinForms\tessnetWinForms\bin\Release\", "eng", false);

    // Empty rectangle = OCR the whole image
    List<tessnet2.Word> result = ocr.DoOCR(image, System.Drawing.Rectangle.Empty);

    // Append one line per word: confidence, text, top, bottom, left, right
    using (StreamWriter writer = new StreamWriter(@"U:\user files\bwalker\ocrTesting2.txt", true))
    {
        foreach (tessnet2.Word word in result)
        {
            writer.WriteLine(word.Confidence + ", " + word.Text + ", " + word.Top + ", " + word.Bottom + ", " + word.Left + ", " + word.Right);
        }
    }

    MessageBox.Show("Completed");
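The question asks for (x, y, width, height), while the code above writes top, bottom, left, and right. A minimal sketch of the conversion, reading lines in the "confidence, text, top, bottom, left, right" layout that the code writes, might look like this. It is written in Python for brevity; `WordBox`, `parse_word_line`, and `parse_results` are hypothetical names, and it assumes image coordinates with the origin at the top-left (so bottom > top) and a confidence value printed with a decimal point rather than a comma.

```python
from typing import List, NamedTuple


class WordBox(NamedTuple):
    text: str
    confidence: float
    x: int
    y: int
    width: int
    height: int


def parse_word_line(line: str) -> WordBox:
    """Parse one 'confidence, text, top, bottom, left, right' line."""
    # Split the four coordinates off from the right so that a comma
    # inside the recognized text (e.g. "$1,200") does not break parsing.
    head, top, bottom, left, right = line.rstrip("\r\n").rsplit(", ", 4)
    confidence, text = head.split(", ", 1)
    top, bottom, left, right = int(top), int(bottom), int(left), int(right)
    # Assumption: width/height are plain differences of the edges.
    return WordBox(text, float(confidence), left, top,
                   right - left, bottom - top)


def parse_results(contents: str) -> List[WordBox]:
    """Parse a whole results file, skipping blank lines."""
    return [parse_word_line(l) for l in contents.splitlines() if l.strip()]
```

For example, `parse_word_line("87, Hello, 10, 30, 5, 60")` yields a box at x=5, y=10 with width 55 and height 20. If the engine's edges are inclusive pixel indices, adding 1 to width and height may be more accurate.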
Ben Walker answered Sep 19 '22 17:09