Suggestions for digit recognition

I'm writing an Android app to extract a Sudoku puzzle from a picture. For each cell in the 9x9 Sudoku grid, I need to determine whether it contains one of the digits 1 through 9 or is blank. I start off with a Sudoku like this:

[image]

I pre-process the Sudoku using OpenCV to extract black-and-white images of the individual digits and then put them through Tesseract. There are a couple of limitations to Tesseract, though:

  1. Tesseract is large, contains lots of functionality I don't need (i.e. full text recognition), and requires English-language training data in order to function, which I think has to go onto the device's SD card. At least I can tell it to look only for digits using tesseract.setVariable("tessedit_char_whitelist", "123456789");
  2. Tesseract often misinterprets a single digit as a string of digits, often containing newlines. It also sometimes just plain gets it wrong. Here are a few examples from the above Sudoku:

[image]
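
One possible workaround for the multi-character output is to post-filter whatever the OCR engine returns for a cell. The sketch below is in Python for brevity and is only an illustration: the function name `clean_ocr_digit` and the majority-vote rule are assumptions, not part of Tesseract.

```python
from collections import Counter


def clean_ocr_digit(raw):
    """Reduce raw OCR output for one cell to a single digit 1-9, or None.

    Hypothetical heuristic: drop everything that is not a digit 1-9
    (newlines, spaces, stray characters), then keep the most frequent digit.
    """
    digits = [c for c in raw if c in "123456789"]
    if not digits:
        return None  # nothing usable: treat the cell as blank or unreadable
    counts = Counter(digits).most_common()
    # Reject the result when two different digits tie for first place.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return int(counts[0][0])
```

A string such as "7\n\n7" collapses to 7, while an ambiguous string such as "1\n7" is rejected so the cell can be re-examined instead of being silently guessed.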

I have three questions:

  1. Is there any way I can overcome the limitations of Tesseract?
  2. If not, what is a useful, accurate method to detect individual digits (not k-nearest neighbours) that would be feasible to implement on Android? This could be a free library or a DIY solution.
  3. How can I improve the pre-processing to target that method? One possibility I've considered is using a thinning algorithm, as suggested by this post, but I'm not going to bother implementing it unless it will make a difference.
asked Nov 10 '12 05:11 by 1''


2 Answers

I took a class with one of the computer vision superstars who was/is at the top of the digit recognition algorithm rankings. He was really adamant that the best way to do digit recognition is...

1. Get some hand-labeled training data.
2. Run Histogram of Oriented Gradients (HOG) on the training data, and produce one
    long, concatenated feature vector per image.
3. Feed each image's HOG features and its label into an SVM.
4. For test data (digits on a Sudoku puzzle), run HOG on the digits, then ask
    the SVM to classify the HOG features from the Sudoku puzzle.
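
To make step 2 concrete, here is a rough, dependency-free Python sketch of a simplified HOG. It is an assumption-laden illustration, not OpenCV's implementation: it builds one unsigned-orientation histogram per cell and normalises each cell on its own, whereas a full HOG normalises over overlapping blocks of cells and interpolates between bins. The function name `hog_features` and the defaults (4-pixel cells, 9 bins) are made up for the example.

```python
import math


def hog_features(img, cell=4, bins=9):
    """Simplified HOG: one orientation histogram per cell, L2-normalised.

    img is a list of rows of grayscale values; its height and width are
    assumed to be multiples of `cell`.
    """
    h, w = len(img), len(img[0])
    features = []
    for cy in range(0, h, cell):
        for cx in range(0, w, cell):
            hist = [0.0] * bins
            for y in range(cy, cy + cell):
                for x in range(cx, cx + cell):
                    # Central differences, clamped at the image border.
                    gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
                    gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
                    mag = math.hypot(gx, gy)
                    if mag == 0:
                        continue
                    # Unsigned orientation in [0, 180) degrees.
                    angle = math.degrees(math.atan2(gy, gx)) % 180.0
                    hist[int(angle / (180.0 / bins)) % bins] += mag
            norm = math.sqrt(sum(v * v for v in hist)) or 1.0
            # Concatenate the per-cell histograms into one long vector.
            features.extend(v / norm for v in hist)
    return features
```

For an 8x8 digit image this yields 2 x 2 cells x 9 bins = 36 values per image, which is the "long, concatenated feature vector" that goes into the SVM.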

OpenCV has a HOGDescriptor object, which computes HOG features. Look at this paper for advice on how to tune your HOG feature parameters. Any SVM library should do the job; the CvSVM class that comes with OpenCV should be fine.
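
If it helps to see what the SVM step boils down to, below is a minimal Python sketch of a binary linear SVM trained by sub-gradient descent on the hinge loss. It is not the CvSVM API; the names `train_linear_svm` and `predict` and the hyperparameters are assumptions chosen for illustration. For ten digit classes you would typically train one such classifier per digit (one-vs-rest) or just use a library.

```python
def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Train a binary linear SVM; labels must be +1 or -1.

    Uses plain sub-gradient descent on the L2-regularised hinge loss.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # L2 regularisation shrinks the weights on every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if margin < 1.0:
                # Sample is misclassified or inside the margin: step towards it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(model, x):
    """Return +1 or -1 for feature vector x."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
```

In the pipeline above, `samples` would be the HOG vectors of the training digits and `labels` would mark "is this digit d" versus "is not".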

For training data, I recommend using the MNIST handwritten digit database, which has 70,000 labeled pictures of digits (60,000 for training and 10,000 for testing) with ground-truth data.

A slightly harder problem is to draw a bounding box around digits that appear in nature. Fortunately, it looks like you've already found a strategy for doing bounding boxes. :)

answered Sep 28 '22 10:09 by solvingPuzzles


The easiest thing is to use normalized central moments for digit recognition. If you have one font (or very similar fonts), it works well.

See this solution: https://github.com/grzesiu/Sudoku-GUI

The core module contains the code responsible for digit extraction, recognition, and moment training. The first time the application is run, the operator must tell it which digit is shown. The moments of the image (the extracted square ROI) are then assigned to that digit (the operator's input). The application recognizes digits by comparing moments.
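
A rough Python sketch of the idea, written independently of the linked project (the function names and the squared-distance comparison are assumptions for illustration): compute the seven normalized central moments eta_pq = mu_pq / mu_00^(1 + (p+q)/2) for 2 <= p+q <= 3, store one vector per digit during training, and pick the closest stored vector at recognition time.

```python
def normalized_central_moments(img):
    """Return the seven normalised central moments (orders 2 and 3).

    img is a list of rows of pixel intensities (e.g. 0/1 for a binary ROI).
    """
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            m00 += v
            m10 += x * v
            m01 += y * v
    if m00 == 0:
        raise ValueError("image is empty")
    cx, cy = m10 / m00, m01 / m00  # centroid, removes translation
    orders = [(2, 0), (1, 1), (0, 2), (3, 0), (2, 1), (1, 2), (0, 3)]
    etas = []
    for p, q in orders:
        mu = sum(((x - cx) ** p) * ((y - cy) ** q) * v
                 for y, row in enumerate(img) for x, v in enumerate(row))
        # Dividing by a power of the area removes the dependence on scale.
        etas.append(mu / m00 ** (1 + (p + q) / 2.0))
    return etas


def classify(img, templates):
    """templates maps digit -> moment vector recorded during training."""
    feats = normalized_central_moments(img)

    def dist(digit):
        return sum((a - b) ** 2 for a, b in zip(feats, templates[digit]))

    return min(templates, key=dist)
```

Because the moments are taken about the centroid, a digit that sits slightly off-centre in its cell still produces the same vector, which is why this works when all puzzles use one font.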

The first YouTube video here shows how the application works: http://synergia.pwr.wroc.pl/2012/06/22/irb-komunikacja-pc/

answered Sep 28 '22 10:09 by krzych