My software needs to read a fixed-length handwritten number.
While I could use a general-purpose library like Tesseract, I am sure there is something smarter. Tesseract will probably misread some of the 1s or 7s as I or l, whereas software that expects only digits would not.
Knowing that the field contains only digits (written the American-English way), the algorithm could focus on 10 potential matches instead of hundreds of symbols.
Any experience OCRing handwritten number-only fields?
What open source library/software did you get the best results with?
From the FAQ of Tesseract:
How do I recognize only digits?
In 2.03 and above, use
TessBaseAPI::SetVariable("tessedit_char_whitelist", "0123456789");
before calling an Init function, or put this in a text file called tessdata/configs/digits:
tessedit_char_whitelist 0123456789
and then your command line becomes:
tesseract image.tif outputbase nobatch digits
Warning: Until the old and new config variables get merged, you must have the nobatch parameter too.
However, since Tesseract was designed for printed rather than handwritten text, I think accuracy might suffer even when it is restricted to digits.
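For what it's worth, here is a rough Python sketch of how the command line quoted from the FAQ could be wrapped in a script. The function names (digits_command, fixed_length_number, read_number) are hypothetical, and the length check is my own assumption based on the question's "fixed-length" requirement, not something the FAQ suggests. It also assumes the tesseract binary is on the PATH and that the tessdata/configs/digits file described above exists.

```python
import re
import subprocess
from pathlib import Path
from typing import List, Optional


def digits_command(image: str, output_base: str) -> List[str]:
    """Build the command line quoted from the FAQ:
    tesseract <image> <outputbase> nobatch digits
    """
    return ["tesseract", image, output_base, "nobatch", "digits"]


def fixed_length_number(raw: str, length: int) -> Optional[str]:
    """Return the OCR result if it is exactly `length` ASCII digits, else None.

    Assumption (not from the FAQ): a result of the wrong length, or one
    containing a non-digit, is rejected rather than repaired.
    """
    candidate = re.sub(r"\s+", "", raw)  # strip spaces and newlines only
    if re.fullmatch(r"[0-9]{%d}" % length, candidate):
        return candidate
    return None


def read_number(image: str, output_base: str, length: int) -> Optional[str]:
    """Run Tesseract on `image` and validate the text it writes.

    Tesseract writes its result to <output_base>.txt.
    """
    subprocess.run(digits_command(image, output_base), check=True)
    raw = Path(output_base + ".txt").read_text(encoding="utf-8")
    return fixed_length_number(raw, length)
```

Something like read_number("image.tif", "outputbase", 6) would then return either a six-digit string or None. Rejecting a bad read instead of guessing seems the safer choice for handwritten input, since a None can be routed to manual review.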