The scenario goes like this: I have a picture of a paper that I would like to do some OCR. So take the image below as my input example:
After successfully detecting the area that corresponds to the paper I'm left with a vector<Point>
of 4 coordinates that define its location inside the image. Note that these coordinates will probably not correspond to a perfect rectangle due to the distance of the camera and angle when the picture was taken. For viewing purposes I connected the points in the sub-image so you can see what I mean:
In this case, the points are: [1215, 43] , [52, 67] , [56, 869] and [1216, 884]
At this moment, I need to adjust these points so they become aligned horizontally. What do I mean by that? If you notice the area of the sub-image above, it is a little rotated: the points on right side of the image are positioned a little higher than points on the other side.
In other words, we have image A, which was exaggerated on purpose to look a little more distorted/rotated than reality, and then image B - which is what I would like as the final result of this procedure:
A) B)
I'm not sure which techniques could be used to achieve this transformation. The application also needs to detect automatically how much rotation needs to be done, as I don't have control over the image acquisition procedure.
The purpose is to have a new Mat
with the normalized sub-image. I'm not worried about a possible image distortion right now, I'm just looking for a way to identify how much rotation needs to be done on the sub-image and how to apply it and get a more rectangular area.
Geometric contraction, expansion, dilation, reflection, rotation, shear, similarity transformations, spiral similarities, and translation are all affine transformations, as are their combinations.
The Affine Transformation is a general rotation, shear, scale, and translation distortion operator. That is it will modify an image to perform all four of the given distortions all at the same time.
Affine transformation is a linear mapping method that preserves points, straight lines, and planes. Sets of parallel lines remain parallel after an affine transformation. The affine transformation technique is typically used to correct for geometric distortions or deformations that occur with non-ideal camera angles.
Examples of affine transformations include translation, scaling, homothety, similarity, reflection, rotation, shear mapping, and compositions of them in any combination and sequence.
I think http://felix.abecassis.me/2011/10/opencv-rotation-deskewing/ and http://felix.abecassis.me/2011/10/opencv-bounding-box-skew-angle/ will come in handy. The aforementioned posts don't cover perspective warping (only rotation). To get the best results, you'll have to use warpPerspective
(maybe in conjunction with getRotationMatrix2D
). Use the angles between line segments to find out how much you need to warp the perspective. THe assumption here is that they should always be 90 degrees and that the closest one to 90 degrees is the "closest" vector as far as the perspective is concerned.
Don't forget to normalize your vectors!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With