Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does anyone know any API for OCRing 7-Segment Display for Windows Phone?

I'm trying to develop a Windows Phone 8.1 App but I need to recognize some numbers from different Displays.

enter image description here

enter image description here

enter image description here

I was following this example:

http://bsubramanyamraju.blogspot.com/2014/08/windowsphone-81-optical-character.html

That is using the Microsoft OCR Runtime Library:

https://www.nuget.org/packages/Microsoft.Windows.Ocr/

However, it doesn't work when I'm trying to recognize those kinds of pics. Even I found this site:

https://www.unix-ag.uni-kl.de/~auerswal/ssocr/

Does anyone have a recommendation? Or Does anyone know any code related to it?

Thank for your worthy knowledge.

like image 470
Federico Navarrete Avatar asked Feb 08 '15 21:02

Federico Navarrete


1 Answers

I wish the answer to your question would be "Sure, here it is" with link to a black-box process-anything OCR tool, but there are several aspects involved, which are best considered separately.

First, there is some work on image pre-processing BEFORE you even consider any OCR. Your image samples are very drastically different, and include full range of issues.

SAMPLE 1 has low contrast, so when it is binarized to black and white layer, which most OCR will perform internally at some stage, there are no characters to process. It looks like this after binarization: enter image description here

See this OCR Blog post for additional details on image pre-processing: http://www.ocr-it.com/guide-to-better-mobile-images-from-cell-phone-camera-for-higher-quality-ocr.

Secondly, the image has no dpi information in the header, which some OCR technologies use to determine appropriate scaling of the image. Without header information, some OCR programs may set some default dpi, which may or may not match your image, thus affecting the OCR result. This is NOT critical, but preferred if this can be implemented at the time of picture creation.

SAMPLE 2 has sufficient contrast and adaptive notarization returns a clear image. It is also missing dpi resolution value in the header. enter image description here

SAMPLE 3 has very clear contrast, but it also has no resolution dpi in the header.

Once you have images that are optimized for OCR processing, the next step is to look at OCR technologies.

I did NOT test the once you mentioned, assuming you had correct implementation and yet no success with them. I tested other OCR tools I have used in the past.

In general, there is no 7-segment OCR known to me. However, I was able to adapt to other generic OCR for this specialized task. Every OCR I tried'out-of-box' or with default settings is unable to handle this recognition. And it is logical and expected. Why? Because most generic OCR are written to recognize inseparable pixel patterns for each character. This is related to the "character separability" principle used to separate words into separate characters. In other words, inner OCR algorithms look for connected strokes which make up each character. More powerful commercial OCR allows some breaks in pixel patterns, but they are expected to be minimal to none, like defects in print or scan, which may result in missing character pieces.

7-segment display by nature will have multiple breaks in each character, conflicting with the character separability principle.

More powerful OCR technologies have a) more tolerance to breaks in pixel patterns and/or b) have special settings to handle these cases.

I will perform further testing with OCR-IT web-based OCR API platform, which is well known to me. I worked as a developer on its OCR capabilities. I also use it extensively in my own iOS and Android apps. OCR-IT API is based on a strong commercial OCR engine, so it is having good tolerance to character imperfections as well as some controls to help in this case.

SAMPLE 3. This is the easiest sample to process, so I tested it first. Using OCR-IT API, and making a request with default settings, requesting the output to TXT format, I get the following: enter image description here

It appears that OCR is a) segmenting characters into two separate lines, and b) tries to read resulting patters as close as possible to valid characters.

enter image description here

Based on this quick analysis, making one adjustments to OCR settings results in the following recognition:

enter image description here

The setting that made substantial difference in OCR result is switching from default print type to using "DotMatrix", which is in the middle of this entire OCR-IT API settings XML:

<Job> 
 <InputURL>http://i.stack.imgur.com/wOtFx.jpg</InputURL>
  <CleanupSettings>
      <Deskew>false</Deskew>
      <RemoveGarbage>false</RemoveGarbage>
      <RemoveTexture>false</RemoveTexture>
      <RotationType>NoRotation</RotationType>
  </CleanupSettings>
  <OCRSettings>
      <PrintType>DotMatrix</PrintType>
      <OCRLanguage>English</OCRLanguage>
      <SpeedOCR>false</SpeedOCR>
      <AnalysisMode>MixedDocument</AnalysisMode>
      <LookForBarcodes>false</LookForBarcodes>
  </OCRSettings>
  <OutputSettings>
      <ExportFormat>Text</ExportFormat>
  </OutputSettings>
</Job>

The use of DotMatrix print type turned on necessary algorithms to increase tolerance for breaks in character structure, which commonly occurs by nature of dot-matrix printers in dot-matrix prints. Alternatively, a "Typewriter" print type could be used, since character breaks are also expected in typewritten fonts, thus being automatically handled by OCR.

There could be one more change to the API setting to run OCR using "Digits" character set (language), effectively eliminating any possibility of misreading 1 as I, etc.

SAMPLE 2. In this sample, the gaps in each character's structure are much wider. Even standard algorithms for handling DotMatrix or Typerwriter print types cannot accommodate these wide gaps. The use of all possible setting variations returned something like this: enter image description here

Character segmentation seems to be the issue. One technical solution goes back to image pre-processing. A simple algorithm can be implemented to fill in gaps between each segment of the 7-segment character. It does not have to be very precise, something like this:

enter image description here

But that is enough to produce a perfect OCR result.

enter image description here

Since it may be unknown in advance which 7-segment LCD display will require filled in gaps, and which does not, I recommend applying this algorithm to all LCD 7-segment images, with small or large gaps. I would limit the size of the gap to no wider than the width of a segment. Given these screens come in various background and segment colors, this pre-procession algorithm can be substantially simplified if it is performed on binarized (black & white) image.

Overall, this task is possible with OCR and near out-of-box functionality, assuming that some image pre-processing is performed. In general, I believe that image pre-processing is required for any OCR-related project anyway, specific to that project.

If you have any further questions about OCR or image pre-processing, pm me.

like image 175
Ilya Evdokimov Avatar answered Oct 18 '22 18:10

Ilya Evdokimov