I am trying to detect text in an input image using OpenCV. For that I need to remove the noise components from the image. The criterion I am using is that if the pixel count of a component is less than 15, I eliminate that component.
E.g. suppose the images given below are provided as input to the function:
As can be seen, both images contain a lot of unwanted noisy pixels, especially the first one.
So if anyone can suggest a feasible way of achieving this, it would be highly appreciated.
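Ok, sorry, but this isn't in C and it's not using OpenCV; however, I'm sure labelling must be possible in OpenCV, I just haven't used it yet... so this might help. Basically the idea is to label all the separate blobs in the image and then remove the blobs that fall outside certain constraints (size, shape).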
Here I implement this in Python using SciPy, but just for size (not shape, although that is easy and would get rid of the long thin lines in the first image below). For this to work we must know an acceptable range of sizes for the letters; however, you could determine this after labelling by looking at the average blob size. You may still get letter-sized false positives, but these could possibly be removed by observing that they fall outside an area of concentrated blobs (as text is spatially regular). Minimum sentence length could also be a powerful constraint.
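The suggestion of deriving the size range from the labelled blobs themselves could be sketched roughly as below. This is only an illustration under stated assumptions: the helper names `size_range_from_blobs` and `keep_blobs_in_range` and the factors 0.5 and 2.0 are made up for the example and are not tuned values, and the median is used in place of the mean so that a few very large blobs do not skew the estimate.

```python
import numpy as np
from scipy import ndimage


def size_range_from_blobs(binary, low_factor=0.5, high_factor=2.0):
    """Guess an acceptable blob-size range from the blobs themselves.

    `binary` is a 2-D array whose non-zero pixels are foreground.
    The two factors are hypothetical defaults, not tuned values.
    """
    labels, n = ndimage.label(binary)
    if n == 0:
        return 0, 0
    # bincount gives the pixel count per label; index 0 is the background
    sizes = np.bincount(labels.ravel())[1:]
    typical = np.median(sizes)
    return low_factor * typical, high_factor * typical


def keep_blobs_in_range(binary, min_size, max_size):
    """Return a copy of `binary` with out-of-range blobs set to 0."""
    labels, n = ndimage.label(binary)
    sizes = np.bincount(labels.ravel(), minlength=n + 1)
    keep = (sizes >= min_size) & (sizes <= max_size)
    keep[0] = False  # never keep the background
    # look up each pixel's label in the keep table
    return np.where(keep[labels], binary, 0)
```

Whether a median-based range works depends on the image: if noise specks outnumber the letters, the median will describe the noise and not the text, so the result should be checked by eye.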
Anyhow, code:
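import numpy as np
from PIL import Image
from scipy import ndimage
# scipy.misc.imread/imsave have been removed from SciPy, so Pillow does the I/O here
im = np.array(Image.open('learning2.png').convert('L'))
# threshold the image so it's binary, then invert (`label` treats non-zero pixels as foreground):
im[im > 100] = 255
im[im <= 100] = 0
im = 255 - im
# label the image:
blobs, number_of_blobs = ndimage.label(im)
# remove all labelled blobs that are outside of our size constraints
# (labels run from 1 to number_of_blobs; 0 is the background):
for i in range(1, number_of_blobs + 1):
    size = np.count_nonzero(blobs == i)
    if size < 40 or size > 150:
        im[blobs == i] = 0
Image.fromarray(im).save('out.png')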
results: