Prepare complex image for OCR

I want to recognize the digits on a credit card. To make things worse, the source image is not guaranteed to be of high quality. The OCR itself will be done by a neural network, but that is not the topic here.

The current issue is image preprocessing. Because credit cards can have backgrounds and other complex graphics, the text is not as clear as in a scanned document. I experimented with edge detection (Canny, Sobel), but without much success. Calculating the difference between the greyscale image and a blurred copy of it (as suggested in "Remove background color in image processing for OCR") did not lead to an OCR-able result either.

I think most approaches fail because the contrast between a specific digit and its background is not strong enough. Is it perhaps necessary to segment the image into blocks and find the best preprocessing for each block?

Do you have any suggestions on how to convert the source into a readable binary image? Is edge detection the way to go, or should I stick with basic color thresholding?

Here is a sample of a greyscale-thresholding approach (where I am obviously not happy with the results):

Original image: [image]

Greyscale image: [image]

Thresholded image: [image]

Thanks for any advice, Valentin

asked Feb 22 '12 16:02

valentin

1 Answer

If it's at all possible, request that better lighting be used to capture the images. A low-angle light would illuminate the edges of the raised (or sunken) characters, thus greatly improving the image quality. If the image is meant to be analyzed by a machine, then the lighting should be optimized for machine readability.

That said, one algorithm you should look into is the Stroke Width Transform, which is used to extract characters from natural images.

See also: "Stroke Width Transform (SWT) implementation (Java, C#...)"

A global threshold (for binarization or clipping edge strengths) probably won't cut it for this application, and instead you should look at localized thresholds. In your example images the "02" following the "31" is particularly weak, so searching for the strongest local edges in that region would be better than filtering all edges in the character string using a single threshold.
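To illustrate the difference, here is a minimal sketch of a local mean threshold next to a global one. It is plain Python on a tiny synthetic image, not a tuned solution for card images. The function names, the `radius` and `offset` parameters, and the sample data (a background that darkens from left to right, with two dark strokes) are all assumptions made for this example. For real images, a library routine such as OpenCV's `cv2.adaptiveThreshold` implements a comparable local-mean idea.

```python
def make_gradient_sample(height=5, width=12):
    """Hypothetical test image: background fades from 200 down to 68,
    with two vertical strokes that are 40 levels darker than the background."""
    rows = []
    for _ in range(height):
        row = [200 - 12 * x for x in range(width)]
        for x in (2, 9):  # stroke columns, chosen arbitrarily
            row[x] -= 40
        rows.append(row)
    return rows


def global_threshold(gray, level):
    """Mark every pixel darker than one fixed level as foreground (1)."""
    return [[1 if value < level else 0 for value in row] for row in gray]


def local_threshold(gray, radius=2, offset=5):
    """Mark a pixel as foreground (1) if it is more than `offset` levels
    darker than the mean of its (2*radius+1) x (2*radius+1) neighbourhood.
    Windows are clipped at the image border."""
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            out[y][x] = 1 if gray[y][x] < local_mean - offset else 0
    return out


if __name__ == "__main__":
    image = make_gradient_sample()
    print("global, level 100:", global_threshold(image, 100)[0])
    print("global, level 150:", global_threshold(image, 150)[0])
    print("local mean       :", local_threshold(image)[0])
```

On this sample, a global level of 100 misses the stroke on the bright side, and a level of 150 swallows most of the dark side of the background, while the local version marks only the two stroke columns. The `offset` keeps a smooth gradient from being flagged. Both parameters would need tuning on real data.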

If you can identify partial segments of characters, then you might use some directional morphology operations to help join segments. For example, if you have two nearly horizontal segments like the following, where 0 is the background and 1 is the foreground...

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 1 1 1 0 0 1 1 1 1 1 1 0 0 0
0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 0 0 0

then you could perform a morphological "close" operation along the horizontal direction only to join those segments. The kernel could be something like the following, where x presumably marks a position that is ignored:

x x x x x
1 1 1 1 1
x x x x x
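A horizontal-only close can be sketched in plain Python as a dilation followed by an erosion over a 1-by-N window. This is an illustration under assumptions, not the only way to do it: the function names are made up for this example, the kernel width is assumed to be odd, and windows are simply clipped at the image border. With OpenCV the same idea would typically be expressed with `cv2.morphologyEx` and a one-row structuring element.

```python
def dilate_horizontal(image, width):
    """Each pixel becomes the maximum of its horizontal window."""
    radius = width // 2
    n = len(image[0])
    return [
        [max(row[max(0, x - radius):min(n, x + radius + 1)]) for x in range(n)]
        for row in image
    ]


def erode_horizontal(image, width):
    """Each pixel becomes the minimum of its horizontal window."""
    radius = width // 2
    n = len(image[0])
    return [
        [min(row[max(0, x - radius):min(n, x + radius + 1)]) for x in range(n)]
        for row in image
    ]


def close_horizontal(image, width=5):
    """Morphological close along rows only: dilate, then erode."""
    return erode_horizontal(dilate_horizontal(image, width), width)


# The example segments from above (0 = background, 1 = foreground).
segments = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
]

if __name__ == "__main__":
    for row in close_horizontal(segments, width=5):
        print(" ".join(str(v) for v in row))
```

With a width of 5, the two-pixel gap in the second row is bridged, but so is every other horizontal gap of up to four pixels, which here fills the third row completely. A narrower kernel (for example width 3) joins the top segments while leaving the wider gaps in the third row open, so the kernel width is worth choosing against the expected gap size.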

There are more sophisticated methods to perform curve completion using Bézier fits or even Euler spirals (a.k.a. clothoids), but the preprocessing needed to identify segments to be joined and the postprocessing needed to eliminate poor joins can get very tricky.

answered Oct 20 '22 09:10

Rethunk

Tags: image-processing, ocr, edge-detection
