I need to recognise numbers from the camera image on iPhone, in real-time. I know there will be no more than 5 digits on the image.
Is this problem realistic to solve given the computational specifications of the iPhone? Does anyone have any experience using the Tesseract OCR library, and do you think it could be solved by using it?
That depends on your definition of "real-time", but yes, it should be possible to do relatively fast recognition of just the digits 0-9 on an iPhone 4, particularly if you can constrain the fonts, lighting conditions, etc. that they will appear in.
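One cheap way to exploit those constraints is to reject any reading that cannot be valid before it reaches the rest of the app. The following is only a minimal sketch, assuming the recognizer hands back a raw string per frame; `accept_reading` and `MAX_DIGITS` are hypothetical names chosen for illustration, and the limit of 5 comes from the question.

```python
import re

# Assumption from the question: a valid reading has at most 5 digits.
MAX_DIGITS = 5
_VALID = re.compile(r"[0-9]{1,%d}" % MAX_DIGITS)


def accept_reading(raw):
    """Return the cleaned digit string, or None if the reading is implausible.

    `raw` is whatever text the recognizer produced for one frame
    (hypothetical input; any OCR engine could supply it).
    """
    # Drop spaces and newlines that OCR engines often insert between glyphs.
    text = "".join(raw.split())
    # Keep the reading only if it is 1 to MAX_DIGITS ASCII digits.
    return text if _VALID.fullmatch(text) else None
```

Frames that return `None` can simply be skipped, so a misread frame costs nothing more than waiting for the next one.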
I highly recommend reading the article on how Sudoku Grab does its recognition of puzzles using the iPhone camera. In their case, a trained neural network was used to identify the digits, which should be reasonably simple and fast on modern iOS hardware.
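To give a sense of how little work such a network does per digit, here is a minimal sketch of a forward pass through a small fully connected network. This is not Sudoku Grab's actual code: the layer sizes, the `tanh` activation, and the names `dense` and `classify_digit` are all assumptions for illustration, and the weights would have to come from training done ahead of time.

```python
import math


def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One fully connected layer: a dot product plus bias per output unit.

    Each output unit is computed independently of the others.
    """
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def classify_digit(pixels, hidden_w, hidden_b, out_w, out_b):
    """Return (digit, confidence) for one flattened, normalized glyph image.

    The weights are hypothetical here; a real app would load trained values.
    """
    hidden = [math.tanh(v) for v in dense(pixels, hidden_w, hidden_b)]
    probs = softmax(dense(hidden, out_w, out_b))
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]
```

For a glyph scaled down to something like 16x16 pixels, this comes to a few thousand multiply-adds per digit, so five digits per frame is a small workload.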
The current recognition libraries out there, like OpenCV, will use the iPhone's CPU to do the processing. I've heard that they can do even more complex tasks like facial recognition fast enough to use with video sources while showing a minimal amount of stutter.
For even better performance, I believe that there's a lot of potential in the programmable GPUs on the newer iOS devices. In my benchmarks, I saw a 14X - 28X speedup when using the iPhone 4's GPU for simple image processing. While few people are looking at this right now, something like Sudoku Grab's neural network should be a parallel enough process to benefit from running on the GPU.
It should be computationally possible. There are apps that read bar codes in real time, and there is also an app that does real-time translation (Word Lens). I'm not sure what libraries they use, however.