
Optical character recognition program for photographs

Tags: matlab, ocr

I need to develop an optical character recognition (OCR) program in MATLAB (or any other language that can do this) to extract the reading shown in this photograph.

The program must be able to process picture files in bulk, since I have around 40,000 pictures to work through.

The general aim of this task is to record intraday gas readings from the specific gas meter shown in the photograph. There is a webcam currently set up that is programmed to photograph the meter every minute, so the OCR program would turn those images into historic intraday gas reading data.

Which software is best suited to this, and are there any online resources available for it?

Apollon1954 asked Jan 27 '11 18:01



2 Answers

I'd break down the basic recognition steps as follows:

  1. Locate meter display within the image
  2. Isolate and clean up the digits
  3. Calculate features
  4. Classify each digit using a model you've trained using historic examples

Assuming that the camera for a particular location does not move, step 1 will only need to be performed once. Step 2 will include things like enhancing contrast and filtering noise. Step 3 can include any useful calculations you can think of, such as mean and skew of "ink" (white) pixels. Step 4 would utilize a model you build to classify a single digit as '0', '1', ... '9', and could be accomplished using k-nearest neighbors, logistic regression, SVM, neural network, etc.
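Steps 2 to 4 might look roughly like the sketch below. It is only an illustration in plain Python, under the assumption that each digit has already been cropped to a small grayscale grid (rows of 0-255 values); the function names, the cutoff of 128, and the particular features (ink fraction and ink centroid) are hypothetical choices, not a recommendation for this specific meter. A real pipeline would more likely use an image or array library.

```python
from collections import Counter

def threshold(gray, cutoff=128):
    """Step 2: binarize a grayscale digit crop; 1 marks bright 'ink' pixels."""
    return [[1 if px >= cutoff else 0 for px in row] for row in gray]

def features(binary):
    """Step 3: ink fraction plus the normalized (row, col) mean of ink pixels."""
    h, w = len(binary), len(binary[0])
    ink = [(r, c) for r in range(h) for c in range(w) if binary[r][c]]
    if not ink:
        return (0.0, 0.5, 0.5)
    n = len(ink)
    mean_r = sum(r for r, _ in ink) / n / max(h - 1, 1)
    mean_c = sum(c for _, c in ink) / n / max(w - 1, 1)
    return (n / (h * w), mean_r, mean_c)

def knn_classify(sample, training, k=3):
    """Step 4: majority vote among the k nearest labelled feature vectors."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(training, key=lambda item: dist2(item[0], sample))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy 3x3 "digits": a vertical bar stands in for '1', a ring for '0'.
one = [[0, 255, 0], [0, 255, 0], [0, 255, 0]]
zero = [[255, 255, 255], [255, 0, 255], [255, 255, 255]]
training = [(features(threshold(one)), "1"), (features(threshold(zero)), "0")]

# A dimmer, slightly noisy bar should still land nearest the '1' example.
noisy_one = [[0, 200, 0], [0, 220, 0], [0, 180, 40]]
print(knn_classify(features(threshold(noisy_one)), training, k=1))
```

With real data, the training list would hold many hand-labelled crops per digit, and a larger k would make the vote less sensitive to a single bad example.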

Predictor answered Sep 30 '22 12:09



A couple of things would make step 1 in Predictor's answer easier: placing the cam directly above the meter, adding sufficient light, and maybe placing bright pink strips around the meter to help segment out the display :).

Once you do this, and the cam remains fixed, you can use a manual process once and then have it applied to all subsequent images to segment out the digits. If the lighting is good and consistent, you might just be able to use simple template matching to identify each of the segmented digits.
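As a rough sketch of that idea, assuming the image has already been thresholded to 0/1 pixels and the digit boxes were marked by hand once, template matching can be as simple as counting agreeing pixels. The box format `(top, left, height, width)`, the function names, and the tiny 3x3 templates are all made up for illustration.

```python
def crop(image, box):
    """Cut one digit out of a binary image; box = (top, left, height, width)."""
    top, left, h, w = box
    return [row[left:left + w] for row in image[top:top + h]]

def match_score(patch, template):
    """Fraction of pixels on which a patch and an equally sized template agree."""
    total = len(patch) * len(patch[0])
    same = sum(p == t for pr, tr in zip(patch, template) for p, t in zip(pr, tr))
    return same / total

def read_digits(image, boxes, templates):
    """Label each fixed box with the digit whose template matches best."""
    return "".join(
        max(templates, key=lambda d: match_score(crop(image, b), templates[d]))
        for b in boxes
    )

# Hypothetical templates, one clean example per digit.
templates = {
    "1": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "7": [[1, 1, 1], [0, 0, 1], [0, 0, 1]],
}

# Boxes chosen manually once; reused for every later frame from the fixed cam.
boxes = [(0, 0, 3, 3), (0, 3, 3, 3)]

# A frame showing "1" (with one stray pixel) followed by "7".
frame = [
    [0, 1, 0, 1, 1, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 0, 1],
]
print(read_digits(frame, boxes, templates))
```

Because the boxes are fixed, this only holds up while the camera and meter stay put; if either moves, the boxes need to be marked again.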

Actually, once you have a sample of every digit, you might even be able to classify them with something simpler, such as the sum of the pixels in each thresholded digit picture.
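A minimal sketch of the pixel-sum idea, assuming each digit is already a 0/1 grid and that you have recorded one reference sum per digit from your samples. The reference values below are invented for the example, and the names are hypothetical.

```python
def ink_sum(binary):
    """Count the 'ink' pixels in a thresholded (0/1) digit picture."""
    return sum(sum(row) for row in binary)

def classify_by_sum(binary, reference_sums):
    """Pick the digit whose recorded ink count is closest to this picture's."""
    s = ink_sum(binary)
    return min(reference_sums, key=lambda d: abs(reference_sums[d] - s))

# Hypothetical reference counts measured once from clean sample digits.
reference_sums = {"1": 3, "8": 7}

# A '1' with one stray pixel still sits closest to the count recorded for '1'.
digit = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]
print(classify_by_sum(digit, reference_sums))
```

This only works if every digit has a clearly different ink count under your lighting; digits with similar counts would be confused, in which case template matching is probably the safer fallback.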

Ashish Uthama answered Sep 30 '22 12:09
