I have PDF files or JPG images that I need to print (these are formulars that will be filled in by hand), and I need that:
after scan, the image should be rotated / scaled perfectly with Python, so that I can access to certain parts of the formular with x, y coordinates
the scanned images should be recognized with a number ID.
What kind of pattern should I add on the formular paper, so that a Python library (which one?) will be able to crop it / rotate it / identify it with an ID?
I was thinking about adding a QR-code (containin a number ID) on top left of paper, and a QR-code on bottom right of the paper, or maybe also "hirondelles" symbols:
The QR code would be generated with qrcode library:
import qrcode
img = qrcode.make('ID1138934')
and added on top the PDF with this method.
Note:
I've read the question How to decode a QR-code image in (preferably pure) Python?, but this does not detect the QR-code on a scanned paper; it only takes a PNG containing the QR-code only as input.
Here is a JPG, example of scanned document:
This task needs the combination of two techniques:
First step: detect the x,y coordinates of the 4 corners of each QR code. The Python library zbar
is useful for this. The code in this question + the answer shows how to do it.
Second step: now a deskewing / perspective correction / "homography" is needed. Here is how to do it with Python + OpenCV.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With