Understanding Freeman chain codes for OCR

Tags:

Note that I'm really looking for an answer to my question. I am not looking for a link to some source code or to some academic paper: I've already used the source and I've already read papers and still haven't figured out the last part of this issue...

I'm working on some fast screen font OCRing and I'm making very good progress.

I'm already finding the baselines, separating the characters, transforming each character in black & white and then contouring each character in order to apply a Freeman chain code to it.

Basically it's an 8-connected chain code looking like this:

  3  2  1    \ | /   4-- --0    / | \   5  6  7

So if I have an 'a', after all my transformations (including transforming to black and white), I end up with something like this:

11110 00001 01111 10001 10001 01110

Then it's external countour may look like this (I may be making a mistake here, that's ASCII-art contouring and my 'algorithm' may get the contour wrong but that's not the point of my question):

 XXXX X1111X  XXXX1X X01111X X10001X X10001X  X111X   XXX

Following the Xs, I get the chain code, which would be:

0011222334445656677

Note that that's the normalized chain code but you can always normalized a chain code like this: you just keep the smallest integer.

(By the way, there's a super-efficient implementation to find the chain code where you simply take the 8 adjacent pixels of an 'X' and then look in a 256 lookup table if you have 0,1,2,3,4,5,6 or 7)

My question now, however, is: from that 0011222334445656677 chain code, how do I find that I have an 'a'?

Because, for example, if my 'a' looks like this:

11110 00001 01111 10001 10001 01111  <-- This pixel is now full

Then my chain code is now: 0002222334445656677

And yet this is also an 'a'.

I know that the whole point of these chain code is to be resilient to such tiny changes but I can't figure out how I'm supposed to find which character corresponds to one chain code.

I've been that far and now I'm stuck...

(By the way, I don't need 100% efficiency and things like differentiating '0' from 'O' or from 'o' isn't really an issue)

936

asked Jul 16 '11 15:07

SyntaxT3rr0r

1 Answers

What you need is a function d that measures the distance between chain codes. After then finding the letter to a given chain code is straightforward:

Input:

normalized chain codes S for the set of possible letters (generally the cain codes for A-Z, a-z, 0-9, ...)
chain code x of a letter which needs to be detected and which could be slightly deformed (the chain code wouldn't match any chain code in the set S)

The algorithm would iterate through the set of possible chain codes and calculate the distance d(x,si) for each element. The letter with the smallest distance would be the output of the algorithm (the identified letter).

I would suggest following distance function: For two chain codes, add up the length differences of each direction: d(x,si) = |x0-si0| + |x1-si1| + .. + |x7-si7|. x0 is the number of 0s in the chain code x, si0 is the number of 0s in the chain code si, etc.

An example will better explain what I'm thinking about. In the following image there are the letters 8, B and D, the fourth letter is a slightly deformed 8, which needs to be identified. The letters are written with Arial with font-size 8. The second line in the image is 10 times enlarged to better see the pixels.

enter image description here

I manually calculated (hopefully correct) the normalized chain codes which are:

8:  0011223123344556756677 B:  0000011222223344444666666666 D:  00001112223334444666666666 8': 000011222223344556756666 (deformed 8)

The length differences (absolut) are:

 direction | length         | difference to 8'           | 8 | B | D |  8'|   8 |  B |  D | ----------+---+---+---+----+-----+----+-----         0 | 2 | 5 | 4 |  4 |   2 |  1 |  0 |         1 | 3 | 2 | 3 |  2 |   1 |  0 |  1 |         2 | 3 | 5 | 3 |  5 |   2 |  0 |  2 |         3 | 3 | 2 | 3 |  2 |   1 |  0 |  1 |         4 | 2 | 5 | 4 |  2 |   0 |  3 |  2 |         5 | 3 | 0 | 0 |  3 |   0 |  3 |  3 |         6 | 3 | 9 | 9 |  5 |   2 |  4 |  4 |         7 | 3 | 0 | 0 |  1 |   2 |  1 |  1 | ----------+---+---+---+----+-----+----+-----                         sum   10 | 12 | 14 |

8' has the smallest distance to the chain code of 8, thus the algorithm would identify the letter 8. The distance to the letter B is not much bigger, but this is because the deformed 8 looks almost like the B.

This method is not scaling invariant. I think there are two options to overcome this:

For different font sizes, having different sets of normalized chain codes
One set of normalized chain codes at a big size (e.g. 35x46 pixel) and scaling the input letter (which needs to be identified) to this bigger size.

I'm not quite sure if the distance function is good enough for the set of all alphanumeric letters but I hope so. To minimize the error in identifying a letter you could include other features (not only chain codes) into the classification step. And again, you would need a distance measure -- this time for feature vectors.

106

answered Sep 23 '22 14:09

Christian Ammer

Related questions
                            
                                Should one prefer STL algorithms over hand-rolled loops?
                            
                                NumPy grouping using itertools.groupby performance
                            
                                Algorithm to implement kinetic scrolling
                            
                                Finding multiple entries with binary search
                            
                                Why is the complexity of computing the Fibonacci series 2^n and not n^2?
                            
                                Searching for an element in a circular sorted array
                            
                                Algorithm to rotate an array in linear time
                            
                                Calculating pow(a,b) mod n
                            
                                Implementation of C lower_bound
                            
                                Why Selection sort can be stable or unstable
                            
                                Rounding to nearest 100
                            
                                Fast Arc Cos algorithm?
                            
                                Your favourite algorithm and the lesson it taught you [closed]
                            
                                Where is my implementation of rot13 in JavaScript going wrong?
                            
                                Minimum exact cover of grid with squares; extra cuts
                            
                                for_each that gives two (or n) adjacent elements
                            
                                Mystical restriction on std::binary_search
                            
                                How can I generate Perlin noise on a spherical surface?
                            
                                Are there any good / interesting analogs to regular expressions in 2d?
                            
                                Kinect pattern recognition [closed]

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Understanding Freeman chain codes for OCR

Tags:

algorithm

ocr

SyntaxT3rr0r

People also ask

1 Answers

Christian Ammer

Recent Activity

Donate For Us