Is there a solution that does a good job at numeric (1-10) handwriting? I tried tesseract but I'm getting only garbage.
Ideally OSS, but commercial would be OK too.
Traditional OCR is all about technology that has “studied” fonts and symbols enough to be able to identify almost all variations of machine-printed text. But therein lies the limitation of traditional OCR: while it's great for extracting printed text from paper, it can't read handwriting.
Handwriting recognition, also known as handwriting OCR or cursive OCR, is a subfield of OCR technology that translates handwritten letters to corresponding digital text or commands in real time. To perform this task, these systems rely on pattern matching to identify various styles of handwritten letters.
Handwriting detection with Optical Character Recognition (OCR): the Google Cloud Vision API can detect and extract text from images. DOCUMENT_TEXT_DETECTION extracts text from an image (or file); the response is optimized for dense text and documents. The JSON includes page, block, paragraph, word, and break information.
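To make the page/block/paragraph/word/break nesting a bit more concrete, here is a minimal sketch in plain Python that walks a response of that shape and rebuilds the text. The `sample_response` dict is hand-written for illustration (it is not real API output), and `extract_text` is a hypothetical helper name, not part of any client library.

```python
# Hypothetical, hand-written fragment shaped like the "fullTextAnnotation"
# part of a DOCUMENT_TEXT_DETECTION response. A real response has many
# more fields (bounding boxes, confidences, languages).
sample_response = {
    "fullTextAnnotation": {
        "pages": [{
            "blocks": [{
                "paragraphs": [{
                    "words": [
                        {"symbols": [
                            {"text": "4"},
                            {"text": "2",
                             "property": {"detectedBreak": {"type": "SPACE"}}},
                        ]},
                        {"symbols": [
                            {"text": "7",
                             "property": {"detectedBreak": {"type": "LINE_BREAK"}}},
                        ]},
                    ]
                }]
            }]
        }]
    }
}


def extract_text(response):
    """Rebuild plain text from the page/block/paragraph/word/symbol nesting."""
    pieces = []
    annotation = response.get("fullTextAnnotation", {})
    for page in annotation.get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    for symbol in word.get("symbols", []):
                        pieces.append(symbol.get("text", ""))
                        # The break information says what follows the symbol.
                        break_type = (symbol.get("property", {})
                                      .get("detectedBreak", {})
                                      .get("type"))
                        if break_type in ("SPACE", "SURE_SPACE"):
                            pieces.append(" ")
                        elif break_type in ("EOL_SURE_SPACE", "LINE_BREAK"):
                            pieces.append("\n")
    return "".join(pieces)


print(repr(extract_text(sample_response)))
```

For digit-only forms you would probably still want to filter the result afterwards (e.g. keep only characters for which `str.isdigit()` is true), since the API returns whatever text it thinks it sees.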
OpenCV now comes with a handwritten digit recognition OCR sample. You can find it here: http://code.opencv.org/projects/opencv/repository/revisions/master/entry/samples/python2/digits.py
It trains both a kNN and an SVM classifier on a set of handwritten digits and then uses them to recognize digits.
The output of the kNN training (not reproduced here) shows an error of only 3.5%.
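Separately from that sample, and only to illustrate the kNN idea itself, here is a minimal pure-Python sketch. It is not the code from digits.py: the 3x3 "images", the `knn_predict` helper and the training set are all made up for illustration, and a real setup would use much larger images and far more training data.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    def sq_dist(a, b):
        # Squared Euclidean distance between two flattened images.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(train, key=lambda item: sq_dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 3x3 bitmaps, flattened row by row (1 = ink, 0 = background).
train = [
    ([0, 1, 0, 0, 1, 0, 0, 1, 0], 1),  # clean "1": vertical stroke
    ([0, 1, 0, 0, 1, 0, 0, 1, 1], 1),  # "1" with a stray pixel
    ([1, 1, 0, 0, 1, 0, 0, 1, 0], 1),  # "1" with a small flag
    ([1, 1, 1, 1, 0, 1, 1, 1, 1], 0),  # clean "0": ring
    ([1, 1, 1, 1, 0, 1, 1, 1, 0], 0),  # "0" with a missing corner
    ([0, 1, 1, 1, 0, 1, 1, 1, 1], 0),  # "0" with another missing corner
]

noisy_one = [0, 1, 0, 0, 1, 0, 1, 1, 0]
noisy_zero = [1, 1, 1, 1, 0, 1, 0, 1, 1]

print(knn_predict(train, noisy_one))
print(knn_predict(train, noisy_zero))
```

The design choice to note is that kNN has no real training step: it just stores the labelled examples and compares each new image against them, which is why it is a common first baseline for digit recognition.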
I came across your post while doing a search and it drew me into some interesting research. I'll share some of my findings with you and the forum:
Research into handwriting recognition (ICR, neural-network-based) and the resulting OCR solutions have really taken off of late. Many algorithms have been proposed in the last decade, yet handwriting recognition remains a challenge!
First, some pointers to free programming resources (MB, you say nothing about your programming environment or OS, so my suggestions cover mainly Windows.)
www.codeproject.com/Articles/143059/Neural-Network-for-Recognition-of-Handwritten-Digi That's the page for "Neural Network for Recognition of Handwritten Digits in C#". It is a rudimentary rework, in C#, of Mike O'Neill's "Neural Network for Recognition of Handwritten Digits", which uses the MFC/C++ model. More about O'Neill's work here: www.codeproject.com/KB/library/NeuralNetRecognition.aspx.
http://asprise.com/product/ocr/selector.php That's Asprise OCR. The solution linked here is for Visual Basic, so Windows is the main appeal, but I see they also support Linux, Mac OS and some flavors of UNIX. The SDKs are free! Download from here: www.asprise.com/product/ocr/download.php?lang=vb
(I have seen positive feedback about Asprise elsewhere on this forum: Window 7 OCR API)
Here's another great resource, a handwriting database for testing OCR abilities: www.yann.lecun.com/exdb/mnist/index.html That's the main page of the MNIST database of handwritten digits (a training set of 60,000 examples and a test set of 10,000 examples). The site says that the digits have already been size-normalized and centered in a fixed-size image, so you can concentrate on learning techniques and pattern recognition methods on real-world data and spend minimal effort on preprocessing and formatting.
MNIST is a (free) subset of the larger NIST database, available (at a cost) here: www.nist.gov/srd/nistsd19.cfm
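If you do download MNIST, the files are in the simple binary IDX format described on that page: big-endian 32-bit header fields followed by unsigned bytes. A minimal sketch of a reader, assuming the file contents are already decompressed into a `bytes` object; the function names are my own, and the tiny `fake_images` / `fake_labels` buffers are synthetic stand-ins, not MNIST data.

```python
import struct


def parse_idx_images(data):
    """Parse IDX3 image bytes into a list of flattened pixel lists (0-255)."""
    # Header: magic number, image count, rows, columns (big-endian uint32).
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:
        raise ValueError("not an IDX3 unsigned-byte image file")
    size = rows * cols
    return [list(data[16 + i * size: 16 + (i + 1) * size])
            for i in range(count)]


def parse_idx_labels(data):
    """Parse IDX1 label bytes into a list of integer labels."""
    # Header: magic number, label count (big-endian uint32).
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:
        raise ValueError("not an IDX1 unsigned-byte label file")
    return list(data[8:8 + count])


# Synthetic stand-ins: two 2x2 "images" and their two labels.
fake_images = struct.pack(">IIII", 2051, 2, 2, 2) + bytes(
    [0, 255, 255, 0, 10, 20, 30, 40])
fake_labels = struct.pack(">II", 2049, 2) + bytes([7, 3])

print(parse_idx_images(fake_images))
print(parse_idx_labels(fake_labels))
```

The downloads are gzip-compressed, so for the real files you would read them with something like `gzip.open(path, "rb").read()` before passing the bytes in.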
www.captricity.com/handwriting-ocr-software/ This solution handles ANY free-form handwriting input.
The software transforms all static documents: paper, PDFs, fax, efax, and also handwriting and human mark-up of any kind. They have a "pay as you go" model (10 pages for free) and a minimal paid plan of $75/month.
www.cvisiontech.com/trapeze/general/trapeze.html?lang=eng CVISION's Trapeze for Forms processes handwritten input in the data fields of structured forms.
It validates the information and outputs it into a document database.
HTH some!