 

Object detection for Android with Tesseract or OpenCV

I have successfully integrated Tesseract into my Android app, and it reads text from whatever image I capture, but with very low accuracy. Most of the time I do not get the correct text, because text surrounding the region of interest is captured as well.

All I want is to read all of the text inside a rectangular area accurately, without picking up the edges of the rectangle. I have done some research and asked about this on Stack Overflow twice, but I still have not found a satisfactory solution.

These are the two posts I made:

https://stackoverflow.com/questions/16663504/extract-text-from-a-captured-image?noredirect=1#comment23973954_16663504

Extracting information from captured image in android

I am not sure whether to continue with Tesseract or switch to OpenCV.

TharakaNirmana asked Jun 21 '13 14:06



1 Answer

Given the many links and answers from others, I think it's good to take a step back and note that there are actually two fundamental steps to optical character recognition (OCR):

  • Text Detection: This is the title and focus of your question, and it is concerned with localizing regions in an image that contain text.
  • Text Recognition: This is where the actual recognition happens, where the localized image regions from detection get segmented character-by-character and classified. This is also where tools like Tesseract come into play.
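
For the goal stated in the question (reading the text inside a rectangle without its border), the hand-off between these two steps can be sketched roughly as follows. This is only an illustration under assumptions: `inset_box`, `recognize_in_box`, the `margin` value, and the `crop` and `recognize` callables are hypothetical names, standing in for whatever image library and OCR engine (for example Tesseract) you actually use.

```python
from typing import Any, Callable, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def inset_box(box: Box, margin: int, image_size: Tuple[int, int]) -> Box:
    """Shrink a detected box by `margin` pixels on each side, clamped to the image.

    Shrinking keeps the rectangle's own border lines out of the crop.
    """
    width, height = image_size
    left, top, right, bottom = box
    left = max(0, left + margin)
    top = max(0, top + margin)
    right = min(width, right - margin)
    bottom = min(height, bottom - margin)
    if right <= left or bottom <= top:
        raise ValueError("margin is too large for this box")
    return (left, top, right, bottom)


def recognize_in_box(image: Any,
                     image_size: Tuple[int, int],
                     box: Box,
                     margin: int,
                     crop: Callable[[Any, Box], Any],
                     recognize: Callable[[Any], str]) -> str:
    """Detection result in, text out: crop the inset region, then run recognition.

    `crop` and `recognize` are placeholders (assumptions), not real library calls.
    """
    inner = inset_box(box, margin, image_size)
    return recognize(crop(image, inner))
```

The point of the sketch is the ordering: the recognizer only ever sees pixels inside the detected region, so stray text and the rectangle's edges around it never reach the recognition step. A suitable margin depends on the border thickness in your images and would need tuning.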

Now, there are also two general settings in which OCR is applied:

  • Controlled: These are images taken from a scanner or a similar device, where the target is a document and factors like perspective, scale, font, orientation, and background consistency are well-behaved.
  • Uncontrolled/Scene: These are the more natural and in-the-wild photos, e.g. those taken from a camera, where you are trying to recognize a street sign, shop name, etc.

Out of the box, Tesseract is best suited to the "controlled" setting. In general, and for scene OCR especially, "re-training" Tesseract will not directly improve detection, though it may improve recognition.

If you are looking to improve scene text detection, see this work; and if you are looking at improving scene text recognition, see this work. Since you asked about detection, the detection reference uses maximally stable extremal regions (MSER), which has a plethora of implementation resources, e.g. see here.
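
As a rough idea of what MSER-based detection can look like in practice, here is a minimal sketch using OpenCV's Python bindings. It is not the method from the referenced work: it only collects MSER bounding boxes and filters them with simple geometric heuristics. The function names and the threshold values (`min_height`, `max_height`, `max_aspect`) are assumptions chosen for illustration and would need tuning on real images.

```python
from typing import Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def filter_text_like(boxes: Iterable[Box],
                     min_height: int = 8,
                     max_height: int = 200,
                     max_aspect: float = 5.0) -> List[Box]:
    """Keep boxes whose size and shape are plausible for a character or short word.

    The thresholds are illustrative guesses, not values from any paper.
    """
    kept = []
    for x, y, w, h in boxes:
        if w <= 0 or h <= 0:
            continue  # degenerate box
        if not (min_height <= h <= max_height):
            continue  # too small (noise) or too large (background blob)
        if w / h > max_aspect:
            continue  # very wide and flat, more likely a line or an edge
        kept.append((int(x), int(y), int(w), int(h)))
    return kept


def detect_text_candidates(image_path: str) -> List[Box]:
    """Run MSER on a grayscale image and return filtered candidate boxes."""
    # Imported here so the filter above can be used without OpenCV installed.
    import cv2

    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(image_path)
    mser = cv2.MSER_create()
    _regions, bboxes = mser.detectRegions(gray)
    return filter_text_like(tuple(b) for b in bboxes)
```

Each surviving box is only a candidate. A fuller pipeline would typically group neighbouring boxes into lines before cropping them and passing the crops to the recognition step.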

There is also a text detection project specifically for Android:
https://github.com/dreamdragon/text-detection

As many have noted, keep in mind that recognition is still an open research challenge.

bjou answered Oct 20 '22 02:10
