Tesseract or any other OCR lib

Question

I'm looking for an explanation, API documentation, or examples of how to use (and train?) Tesseract in C++. I found nothing useful on the Google Tesseract page, and I have yet to find anything elsewhere on the web.

Any useful sources or experiences would be more than welcome, as I have no idea how to begin with it.

P.S.

I'm open to suggestions for other libraries, but only free ones.

Richard Woolf · Accepted Answer

I have some experience with Tesseract. A simple Google search for 'training tesseract' turns up this page: http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract, where you must choose which version of Tesseract you wish to train. While 3 is the latest version, it's brand new and people are still ironing out issues, so I'm still using version 2.4.

You'll see there are about 9 steps in training Tesseract for a particular 'language' (which really should have been called a 'font' or 'character set'). You could also just use the existing 'eng' language, but it depends on your application.

For example, in my application I had to do the document analysis, take a particular region, and OCR a 13-character string of numbers. I needed high accuracy, and I didn't want it reading '5' as 'S' or '0' as 'O', so it was logical to create a 'language' for my particular font set consisting only of the characters 0..9, whereas you might not care if you get extra 'noise
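To illustrate the kind of check that goes with a digits-only field like the 13-character string described above, here is a minimal sketch. It is not part of the Tesseract API and does not replace training a digits-only 'language'; it only validates whatever text the OCR step returned. The helper names `stripWhitespace` and `isDigitString` are hypothetical, introduced here purely for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Tesseract function): remove all whitespace,
// since OCR output often carries trailing newlines or stray spaces.
std::string stripWhitespace(const std::string& raw) {
    std::string out;
    for (unsigned char c : raw) {
        if (!std::isspace(c)) {
            out.push_back(static_cast<char>(c));
        }
    }
    return out;
}

// Hypothetical helper: true only if `text` consists of exactly
// `expectedLength` decimal digits, so a misread such as 'S' for '5'
// or 'O' for '0' is rejected instead of silently accepted.
bool isDigitString(const std::string& text, std::size_t expectedLength) {
    return text.size() == expectedLength &&
           std::all_of(text.begin(), text.end(),
                       [](unsigned char c) { return std::isdigit(c) != 0; });
}
```

A caller would pass the raw OCR result through `stripWhitespace` and then test it with `isDigitString(cleaned, 13)`; a failed check is a signal to re-scan or flag the region rather than to guess at a correction.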

Tags:

c++

ocr

tesseract

image-recognition

snoofkin
