
Connected Character segmentation in OpenCV

What is a good method to segment characters that are connected, as in the following figure, given that:

  • characters have this font, but the font size varies based on the image size
  • only isolated groups of characters from the image are connected

[image: groups of connected characters]

Also, how can I detect whether a given bounding box contains two or more connected letters?

I tried checking for width > height to detect connected characters, but it doesn't work for the blue groups in the image.

I also tried a segmentation method based on section 3.4 of the linked article for separating characters, but got poor results.

Ravul asked Nov 25 '13 13:11



1 Answer

IDEA: If you already have a good OCR engine, try running it on each of these connected components (or contours). If the OCR cannot recognize a letter, then the component is not a single letter; it contains two or more.

IDEA: Check the convexity defects of these connected components; the defect points that lie closest to each other mark where the bridges are.

IDEA: Use a kernel with a small width and a large height for erosion followed by dilation (morphological opening).

IDEA: Take the y-derivative of the image. The smallest contours (or lines) that remain will be your bridges. Mark them and erase those pixels from the original image.

IDEA (search-problem approach): Take two letters of the alphabet in this font, connect them horizontally with some tool, and use OpenCV's matchShapes method (moment matching) to check whether that shape matches your connected component. Alternatively, try implementing autocorrelation.

Good luck.

baci answered Nov 12 '22 12:11
