Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python Tesseract can't recognize this font

I have this image:

alt text

I want to read it to a string using python, which I didn't think would be that hard. I came upon tesseract, and then a wrapper for python scripts using tesseract.

So I started reading images, and it's done great until I tried to read this one. Am i going to have to train it to read that specific font? Any ideas on what that specific font is? Or is there a better ocr engine I could use with python to get this job done.

Edit: Perhaps I could make some sort of vector around the numbers, then redraw them in a larger size? The larger images are the better tesseract ocr seems to read them (no surprise lol).

like image 509
codygman Avatar asked Nov 19 '09 11:11

codygman


1 Answers

Just train the engine for the 10 digits and a '.' . That should do it. And make sure you change your image to grayscale before OCRing it.

like image 50
debayan Avatar answered Oct 02 '22 06:10

debayan