 

OCR TesseractEngine

Tags:

c#

ocr

I am using OCR to recognize the digits in a picture:


var engine = new TesseractEngine(@"C:\Projects\tessdata", "eng", EngineMode.Default);
var currentImage = TakeScreen();
var page = engine.Process(ScaleByPercent(currentImage, 500));
var text = page.GetText().Replace("\n", "");

Scale:

public Bitmap ScaleByPercent(Bitmap imgPhoto, int Percent)
{
    // Convert the percentage into a scale factor (500 -> 5.0).
    float nPercent = Percent / 100f;

    int sourceWidth = imgPhoto.Width;
    int sourceHeight = imgPhoto.Height;
    var destWidth = (int)(sourceWidth * nPercent);
    var destHeight = (int)(sourceHeight * nPercent);

    var bmPhoto = new Bitmap(destWidth, destHeight,
                             PixelFormat.Format24bppRgb);
    bmPhoto.SetResolution(imgPhoto.HorizontalResolution,
                          imgPhoto.VerticalResolution);

    // The using block disposes the Graphics object even if drawing throws.
    using (Graphics grPhoto = Graphics.FromImage(bmPhoto))
    {
        grPhoto.InterpolationMode = InterpolationMode.HighQualityBicubic;
        grPhoto.DrawImage(imgPhoto,
                          new System.Drawing.Rectangle(0, 0, destWidth, destHeight),
                          new System.Drawing.Rectangle(0, 0, sourceWidth, sourceHeight),
                          GraphicsUnit.Pixel);
    }

    // Save a copy of the scaled image so it can be inspected.
    bmPhoto.Save(@"D:\Scale.png", System.Drawing.Imaging.ImageFormat.Png);
    return bmPhoto;
}

But I get the result "10g".

  1. How can I force the engine to recognize only digits?
  2. How can I get the number 1013?
A191919 asked Jul 12 '16 18:07



1 Answer

You can tell the Tesseract engine to look only for digits by setting the tessedit_char_whitelist variable:

var engine = new TesseractEngine(@"C:\Projects\tessdata", "eng", EngineMode.Default);
engine.SetVariable("tessedit_char_whitelist", "0123456789");
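
For context, here is a minimal sketch of how the whitelist might fit into a complete call, assuming the Tesseract NuGet wrapper for .NET. The class DigitOcr, the methods KeepDigits and ReadDigits, and the parameters tessdataPath and imagePath are hypothetical names used only for illustration. The post-filter is an extra safeguard that is not part of the answer above: some Tesseract builds are reported to ignore the whitelist in certain engine modes, so stripping non-digits afterwards may help there.

```csharp
using System;
using System.Linq;
using Tesseract;

static class DigitOcr
{
    // Hypothetical helper: keeps only the ASCII characters '0'..'9'
    // and drops everything else (letters, spaces, newlines).
    public static string KeepDigits(string text) =>
        new string(text.Where(c => c >= '0' && c <= '9').ToArray());

    // Hypothetical wrapper; tessdataPath and imagePath are placeholders.
    public static string ReadDigits(string tessdataPath, string imagePath)
    {
        using (var engine = new TesseractEngine(tessdataPath, "eng", EngineMode.Default))
        {
            // Set the whitelist before processing so it applies to this page.
            engine.SetVariable("tessedit_char_whitelist", "0123456789");

            using (var img = Pix.LoadFromFile(imagePath))
            using (var page = engine.Process(img))
            {
                // Fallback: strip anything that is not a digit.
                return KeepDigits(page.GetText());
            }
        }
    }
}
```

Note that the filter only removes unwanted characters: KeepDigits("10g\n") yields "10", so it cannot recover digits that the engine misread in the first place.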
Strickos9 answered Sep 22 '22 04:09
