I would like to build an Android application that uses an OCR library to scan a picture and extract the text from it.
What Java library should I use?
I don't know how good it is (it definitely needs to be trained first), but there is Ron Cemer's Java OCR library.
If you are looking for a very extensible option, or have a specific problem domain, you could consider rolling your own using the Java Object Oriented Neural Engine.
I used it successfully in a personal project to identify the letter in an image such as this. You can find all the source for the OCR component of my application on GitHub, here.
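To give a feel for what "rolling your own" means, here is a minimal sketch of the underlying idea: a single-layer perceptron that learns to tell two tiny glyphs apart from their pixels. It is plain Java and does not use the Java Object Oriented Neural Engine API. The class name, the 3x3 glyphs and the 'T'/'L' labels are all made up for illustration, and a real recognizer would need many more samples, classes and layers.

```java
/** Toy single-layer perceptron that tells a 3x3 'T' glyph from a 3x3 'L' glyph. */
final class GlyphPerceptron {
    // Hypothetical training glyphs, flattened row by row (1 = ink, 0 = background).
    static final int[] GLYPH_T = {1, 1, 1,  0, 1, 0,  0, 1, 0};
    static final int[] GLYPH_L = {1, 0, 0,  1, 0, 0,  1, 1, 1};

    private final int[] weights = new int[9];
    private int bias = 0;

    /** Returns 1 when the weighted pixel sum is positive ('T'), otherwise 0 ('L'). */
    int predict(int[] pixels) {
        int sum = bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * pixels[i];
        }
        return sum > 0 ? 1 : 0;
    }

    /** Perceptron rule: shift weights by (target - output) * input until an epoch has no mistakes. */
    void train(int[][] samples, int[] targets, int maxEpochs) {
        for (int epoch = 0; epoch < maxEpochs; epoch++) {
            int mistakes = 0;
            for (int s = 0; s < samples.length; s++) {
                int error = targets[s] - predict(samples[s]);
                if (error == 0) {
                    continue;
                }
                mistakes++;
                for (int i = 0; i < weights.length; i++) {
                    weights[i] += error * samples[s][i];
                }
                bias += error;
            }
            if (mistakes == 0) {
                return; // every training sample is classified correctly
            }
        }
    }

    /** Maps the binary prediction back to a letter. */
    char classify(int[] pixels) {
        return predict(pixels) == 1 ? 'T' : 'L';
    }

    public static void main(String[] args) {
        GlyphPerceptron net = new GlyphPerceptron();
        net.train(new int[][] {GLYPH_T, GLYPH_L}, new int[] {1, 0}, 100);

        // A 'T' with its top-left pixel missing, to mimic a noisy scan.
        int[] smudgedT = {0, 1, 1,  0, 1, 0,  0, 1, 0};
        System.out.println(net.classify(GLYPH_T));
        System.out.println(net.classify(GLYPH_L));
        System.out.println(net.classify(smudgedT));
    }
}
```

With only two training samples this converges almost immediately. The point is just the shape of the approach: pixels in, weighted sum, label out.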
Try Tesseract. Check out this article: http://www.itwizard.ro/interfacing-cc-libraries-via-jni-example-tesseract-163.html and this example: http://code.google.com/p/mezzofanti/
Edit: some more facts:
- Tesseract is one of the best open source OCR engines, used by Google
- training data is available for many languages
- mezzofanti is an Android app that uses Tesseract
- beware: OCR uses a lot of CPU power. Trying to OCR an A4 page on your T-Mobile G1 will take a lot of time, and the result may not impress you ;-)
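For a rough idea of why a full A4 page is heavy, the sketch below counts how many pixels the OCR engine has to look at. The 300 and 150 DPI figures are assumptions picked for illustration, not numbers from the post above, and the class and method names are hypothetical.

```java
/** Back-of-the-envelope pixel count for an A4 page (210 mm x 297 mm) at a given resolution. */
final class A4PixelBudget {
    static final double A4_WIDTH_MM = 210.0;
    static final double A4_HEIGHT_MM = 297.0;
    static final double MM_PER_INCH = 25.4;

    /** Number of pixels in an A4 page scanned at the given dots per inch. */
    static long pixelsAt(int dpi) {
        long width = Math.round(A4_WIDTH_MM / MM_PER_INCH * dpi);
        long height = Math.round(A4_HEIGHT_MM / MM_PER_INCH * dpi);
        return width * height;
    }

    public static void main(String[] args) {
        // Assumed resolutions: 300 DPI is a common scan setting, 150 DPI is half of that.
        System.out.println("300 DPI: " + pixelsAt(300) + " pixels");
        System.out.println("150 DPI: " + pixelsAt(150) + " pixels");
    }
}
```

Halving the resolution cuts the pixel count to roughly a quarter, so cropping to the region you care about or scaling down before OCR is worth trying on a slow phone, at the cost of some accuracy on small print.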