
C# Tesseract: recognizing spaces between digits

Tags:

c#

tesseract

I am new to Tesseract and am working on a class project in which I need to scan number matrices. I can read the numbers from an image file, but I have not found a way to recognize the spacing between them. For example, I currently get 14610 for an image containing 1 4 6 10.


Code I am currently using:

// Needs System and System.Drawing, plus the namespace of the OCR wrapper (not shown in the original post).
Bitmap image = new Bitmap(file);
var ocr = new Tesseract();
ocr.SetVariable("tessedit_char_whitelist", "0123456789"); // digits only

ocr.Init(@"C:\Users\MuhammadShahroz\Documents\Visual Studio 2013\Projects\ConsoleApplication3\tessdata", "eng", false);
var results = ocr.DoOCR(image, Rectangle.Empty);

string mystring = "";
foreach (Word word in results)
{
    Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
    // Append each recognized word; the original assignment overwrote the string on every pass.
    mystring += String.Format("{0} ", word.Text);
}
Muhammad Shehroz Sajjad asked Dec 21 '15 20:12





1 Answer

I think you will need to set the variable preserve_interword_spaces to 1 (see configuration source).
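A minimal sketch of where that setting could go, reusing the calls from the question. It assumes the `Tesseract` and `Word` types come from the tessnet2 wrapper (the question does not name the library), and it takes the image path and tessdata folder as hypothetical command-line arguments. I have not verified that the older engine bundled with tessnet2 recognizes `preserve_interword_spaces`; the variable is documented for newer Tesseract releases, so treat this as something to try, not a confirmed fix.

```csharp
using System;
using System.Drawing;
using tessnet2; // assumption: the question's Tesseract/Word types come from tessnet2

class SpacedDigitsDemo
{
    static void Main(string[] args)
    {
        string imagePath = args[0];    // hypothetical: path to the number-matrix image
        string tessdataPath = args[1]; // hypothetical: folder containing the "eng" language data

        using (var image = new Bitmap(imagePath))
        {
            var ocr = new Tesseract();

            // Same whitelist as in the question: digits only.
            ocr.SetVariable("tessedit_char_whitelist", "0123456789");

            // The setting suggested in the answer. Whether it is honored
            // depends on the Tesseract engine version behind the wrapper.
            ocr.SetVariable("preserve_interword_spaces", "1");

            ocr.Init(tessdataPath, "eng", false);

            // Print each recognized word with its confidence, as the question does.
            foreach (Word word in ocr.DoOCR(image, Rectangle.Empty))
            {
                Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
            }
        }
    }
}
```

If the output is unchanged, the engine version is the first thing to check, since an older engine may not know this variable.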

nguyenq answered Oct 20 '22 19:10
