I am designing a system that will scan standardized forms into images (e.g., TBitmap). I would like to identify the alignment marks on these pages and use their locations to rotate the page to its proper orientation (so top is actually up) and to crop the image to the area bounded by the marks.
An example image of a typical mark I'd need to locate is:
(source: tpub.com)
What are techniques to evaluate an image obtained from a scanner to locate various marks within the image? I'd need to locate multiple marks and their center point locations.
Image alignment is the process of overlaying images of the same scene under different conditions, such as from different viewpoints, with different illumination, using different sensors, or at different times.
Just brainstorming some possible approaches.
Template Matching
A brute-force method would be to have a bitmap image of what a registration mark should look like. Then, for every possible rectangle in the image that has the same width and height as the template bitmap, you compare the image pixels to the template pixels. If most of the corresponding pixels match, you've probably found a registration mark. This is very compute-intensive because you have to scan over all possible positions, rotations, scale factors, etc. You can whittle this down by taking advantage of things you know. For example, your registration mark is symmetric, so you don't need to check all possible rotations. Perhaps you know the exact size the mark should be and thus can avoid iterating over different scale factors. Finally, you might know that the registration marks should be near the corners and thus can skip over most of the middle of the image.
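To make the brute-force scan concrete, here is a minimal Python sketch. It assumes the page and the template have already been reduced to lists of rows holding 0 (background) or 1 (ink), and it ignores rotation and scale entirely. The names `match_score` and `find_marks` and the `threshold` parameter are invented for this illustration.

```python
def match_score(image, template, top, left):
    """Fraction of template pixels that equal the image pixels beneath them."""
    th, tw = len(template), len(template[0])
    hits = sum(
        1
        for r in range(th)
        for c in range(tw)
        if image[top + r][left + c] == template[r][c]
    )
    return hits / (th * tw)


def find_marks(image, template, threshold=0.9):
    """Slide the template over every position; keep windows that mostly match.

    Returns (center_row, center_col, score) tuples. `threshold` is an
    assumed tuning knob, not a value taken from the discussion above.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    found = []
    for top in range(ih - th + 1):
        for left in range(iw - tw + 1):
            score = match_score(image, template, top, left)
            if score >= threshold:
                # Report the window center, since the mark's center is what we want.
                found.append((top + th // 2, left + tw // 2, score))
    return found
```

If the marks are known to sit near the corners, the two `range` loops can be limited to those regions, which is where most of the savings would come from.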
Interesting Points
Find a way to identify "interesting points" in the image. For example, points that seem to be at the center of an intersection could be found by convolving the image with a small kernel that reinforces pixels that have matching pixels in the cardinal directions, and then thresholding the result. This gives a list of pixels that seem to be intersection points (there might be some noise). You can search this subset of coordinates for a "constellation" that looks like the five intersection points in your registration mark. You might still need to apply template matching to find the most likely positions, but this would vastly reduce the number of locations, rotations, and scale factors that you'd otherwise have to try.
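One possible reading of that kernel idea, sketched in Python: a plus-shaped kernel that counts ink at a pixel and along its four cardinal arms, followed by a threshold. It assumes a binarized image (lists of 0/1 rows) and lines roughly one pixel thick. The names `cross_response` and `intersection_points`, the arm length, and the `min_fraction` cutoff are all assumptions made for this example.

```python
def cross_response(image, row, col, arm):
    """Count ink pixels at (row, col) and along its four cardinal arms."""
    h, w = len(image), len(image[0])
    total = image[row][col]
    for d in range(1, arm + 1):
        for r, c in ((row - d, col), (row + d, col), (row, col - d), (row, col + d)):
            # Pixels that fall outside the image simply contribute nothing.
            if 0 <= r < h and 0 <= c < w:
                total += image[r][c]
    return total


def intersection_points(image, arm=3, min_fraction=0.8):
    """Return (row, col) for pixels whose plus-shaped neighbourhood is mostly ink.

    A pixel on a single straight line fills only two arms (2 * arm + 1 pixels),
    so a cutoff well above half of the full kernel (4 * arm + 1) keeps crossings only.
    """
    full = 4 * arm + 1
    return [
        (r, c)
        for r in range(len(image))
        for c in range(len(image[0]))
        if cross_response(image, r, c, arm) >= min_fraction * full
    ]
```

With thicker lines this would return a small cluster of pixels per crossing rather than a single point, so some grouping step would probably be needed before searching for the constellation.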
Feature Detection
There are algorithms for line detection, circle detection, etc. You might be able to run a bunch of these and then look for a combination of two crossing line segments within a circle. This may be the most robust way, but it's probably also the hardest to get working.
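As one example of a line detector, here is a bare-bones Hough transform in Python. The Hough transform is my choice for illustration; the paragraph above does not name a specific algorithm. It assumes you already have a list of (x, y) ink or edge coordinates, and the function name `hough_lines` and its parameters are hypothetical.

```python
import math


def hough_lines(points, n_theta=180, min_votes=5):
    """Vote in (rho, theta) space and return the strongly supported lines.

    A line is written as x*cos(theta) + y*sin(theta) = rho, with
    theta = pi * index / n_theta. The result maps (rho, theta_index)
    to the number of points that voted for that line.
    """
    votes = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, t)] = votes.get((rho, t), 0) + 1
    return {key: count for key, count in votes.items() if count >= min_votes}
```

With the default `n_theta=180`, index 0 corresponds to a vertical line `x = rho` and index 90 to a horizontal line `y = rho`. Neighbouring angles tend to collect similar vote counts, so a real implementation would also need to pick local maxima, and the circle check is left out here.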
Some preprocessing steps, like running edge detectors, thresholding, or dilation and erosion filters, might also help if the images aren't very clean to begin with.
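A small Python sketch of the thresholding, erosion, and dilation steps, assuming an 8-bit grayscale page stored as lists of rows where dark values are ink. The function names, the cutoff of 128, and the 3x3 neighbourhood are assumptions for illustration only.

```python
def threshold(gray, cutoff=128):
    """Turn grayscale rows into a binary image: 1 = ink (dark), 0 = background."""
    return [[1 if v < cutoff else 0 for v in row] for row in gray]


def _morph(binary, combine):
    """Apply `combine` (all or any) to each pixel's 3x3 neighbourhood.

    Neighbourhoods are clipped at the image border, so pixels outside
    the image are ignored rather than treated as background.
    """
    h, w = len(binary), len(binary[0])
    out = []
    for r in range(h):
        out_row = []
        for c in range(w):
            window = [
                binary[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, h))
                for cc in range(max(c - 1, 0), min(c + 2, w))
            ]
            out_row.append(1 if combine(window) else 0)
        out.append(out_row)
    return out


def erode(binary):
    """Keep a pixel only if its whole neighbourhood is ink (removes specks)."""
    return _morph(binary, all)


def dilate(binary):
    """Set a pixel if any neighbour is ink (grows shapes back, closes gaps)."""
    return _morph(binary, any)
```

Running `dilate(erode(threshold(gray)))` removes isolated specks while roughly preserving larger shapes. Lines only a pixel or two wide would be eroded away as well, so thin registration marks may call for dilating first or skipping erosion.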
I found this French PDF resource by Colin BOUVRY dealing with recognition of characters and symbols etched on glass.
If you are not comfortable with French, you don't have to worry: a good deal of valuable Delphi source code is listed at the bottom of the document, believe me!
Thanks.