I'm building an automated electricity / gas meter reader using OpenCV and Python. I've got as far as taking shots with a webcam:
I can then use afine transform to unwarp the image (an adaptation of this example):
def unwarp_image(img):
rows,cols = img.shape[:2]
# Source points
left_top = 12
left_bottom = left_top+2
top_left = 24
top_right = 13
bottom = 47
right = 180
srcTri = np.array([(left_top,top_left),(right,top_right),(left_bottom,bottom)], np.float32)
# Corresponding Destination Points. Remember, both sets are of float32 type
dst_height=30
dstTri = np.array([(0,0),(cols-1,0),(0,dst_height)],np.float32)
# Affine Transformation
warp_mat = cv2.getAffineTransform(srcTri,dstTri) # Generating affine transform matrix of size 2x3
dst = cv2.warpAffine(img,warp_mat,(cols,dst_height)) # Now transform the image, notice dst_size=(cols,rows), not (rows,cols)
#cv2.imshow("crop_img", dst)
#cv2.waitKey(0)
return dst
..which gives me an image something like this:
I still need to extract the text using some sort of OCR routine but first I'd like to automate the part that identifies what pixel locations to apply the affine transform to. So if someone knocks the webcam it doesn't stop the software working.
We can do this via the following formula: Assume a window or image with a given WIDTH and HEIGHT. We then know the pixel array has a total number of elements equaling WIDTH * HEIGHT. For any given X, Y point in the window, the location in our 1 dimensional pixel array is: LOCATION = X + Y*WIDTH.
The location of a pixel, or array element, in the image is uniquely defined by its coordinates. THere are two ways to speciy the pixels coordinates: linear offset and axis coordinates. The linear offset is a sequential number ranging from 1 to the total number of pixels in the image.
In digital imaging, a pixel (abbreviated px), pel, or picture element is the smallest addressable element in a raster image, or the smallest addressable element in an all points addressable display device; so it is the smallest controllable element of a picture represented on the screen.
Since your image is pretty much planar, you can look into finding the homography between the image you get from the webcam and the desired image (in the upright position).
Edit: This will rotate the image in the upright position. Once you've registered your image (brought it in the upright position), you could do row-wise or column-wise projections (sum all the pixels along the columns to get one vector, sum all the pixels along the rows to get one vector). You can use these vectors to figure out where you have a jump in color, and crop it there.
Alternatively you can use the Hough transform, which gives you lines in an image. You can probably get away with not registering the image if you do this.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With