
Text Detection: Getting Bounding boxes

I have a black-and-white image of a text document, and I want to get a list of bounding boxes, one for each character. I attempted an algorithm of my own, but it takes excessively long and is only somewhat successful. Are there any Python libraries I can use to find the bounding boxes? I've been looking into OpenCV, but the documentation is hard to follow. And in this tutorial I can't even tell whether the bounding boxes were found, because I can't easily find out what the functions actually do.

Matthew Ciaramitaro asked Apr 24 '18 11:04


1 Answer

You can use cv2.boundingRect(). Make sure the image background is black and the text is white. The code below draws a rectangle around each piece of text in your image. To get a list of every rectangle, collect the (x, y, w, h) values returned by boundingRect() in whatever form your use case requires.

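import cv2

# Read the image as grayscale (flag 0)
img = cv2.imread('input.png', 0)
# Binarize with Otsu's threshold; the text must end up white on a black background
# (use cv2.THRESH_BINARY_INV instead if your document is dark text on a light background)
_, img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# findContours returns 3 values in OpenCV 3.x and 2 values in OpenCV 2.x/4.x;
# the last two are always (contours, hierarchy)
contours, hier = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2:]

boxes = []
for c in contours:
    # get the bounding rect as (x, y, width, height)
    x, y, w, h = cv2.boundingRect(c)
    boxes.append((x, y, w, h))
    # draw a white rectangle to visualize the bounding rect
    cv2.rectangle(img, (x, y), (x + w, y + h), 255, 1)

# outline the contours themselves (the image is grayscale, so the colour is a single value)
cv2.drawContours(img, contours, -1, 255, 1)

cv2.imwrite("output.png", img)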
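
The contours are not necessarily returned in reading order, so a list of rectangles built from boundingRect() may need sorting before it is useful for a text document. Here is a minimal sketch of one way to do that, assuming each box is an (x, y, w, h) tuple as returned by cv2.boundingRect() and that the text lines are roughly horizontal. The helper name sort_reading_order and its line_tolerance parameter are hypothetical, not part of OpenCV.

```python
def sort_reading_order(boxes, line_tolerance=0.6):
    """Sort (x, y, w, h) boxes top-to-bottom, then left-to-right.

    Hypothetical helper (not an OpenCV function). A box joins the current
    text line when its vertical centre lies within
    line_tolerance * (height of the tallest box on that line)
    of that tallest box's vertical centre.
    """
    if not boxes:
        return []

    # Visit boxes from top to bottom by vertical centre.
    by_centre = sorted(boxes, key=lambda b: b[1] + b[3] / 2)

    lines = [[by_centre[0]]]
    for box in by_centre[1:]:
        current = lines[-1]
        # Use the tallest box on the line so far as the reference.
        ref = max(current, key=lambda b: b[3])
        ref_centre = ref[1] + ref[3] / 2
        centre = box[1] + box[3] / 2
        if abs(centre - ref_centre) <= line_tolerance * ref[3]:
            current.append(box)      # same text line
        else:
            lines.append([box])      # start a new text line

    # Within each line, order boxes left to right.
    return [b for line in lines for b in sorted(line, key=lambda b: b[0])]


# Example with made-up boxes: two text lines, given in scrambled order.
example_boxes = [(50, 12, 8, 10), (10, 10, 8, 12), (30, 40, 8, 12),
                 (10, 42, 8, 10), (30, 11, 8, 12)]
ordered = sort_reading_order(example_boxes)
print(ordered)
```

The tolerance is a judgment call: a larger value keeps small marks such as commas on the same line as the letters, but may merge lines that are tightly spaced. Skewed or curved text would need a different grouping strategy.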
Ishara Madhawa answered Oct 01 '22 07:10