Remove noisy lines from an image

I have images that are corrupted by random lines, like the following one:
[image: example of a noisy input]
I want to preprocess them to remove the unwanted noise (the lines that distort the writing) so that I can run OCR (Tesseract) on them.
My idea was to use dilation to remove the noise, and then erosion to restore the missing parts of the writing in a second step.
For that, I used this code:

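import cv2
import numpy as np

# Load as grayscale (assumed: dark writing and dark noise lines on a light background)
img = cv2.imread('linee.png', cv2.IMREAD_GRAYSCALE)
kernel = np.ones((5, 5), np.uint8)
# Dilation takes the local maximum, so it shrinks dark structures (lines and text alike)
img = cv2.dilate(img, kernel, iterations=1)
# Erosion takes the local minimum, growing the surviving dark structures back
img = cv2.erode(img, kernel, iterations=1)
cv2.imwrite('delatedtest.png', img)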

Unfortunately, the dilation didn't work well: the noise lines are still there.

[image: result after dilation and erosion]
I tried changing the kernel shape, but it got worse: the writing was partially or completely deleted.
I also found an answer saying that it is possible to remove the lines by

turning all black pixels with two or less adjacent black pixels to white.
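
A rough sketch of that rule, using NumPy only. The function name `remove_sparse_black_pixels`, the 8-neighbour definition of "adjacent", and the boolean input format (True = black pixel) are assumptions made for illustration; the quoted answer does not specify them.

```python
import numpy as np

def remove_sparse_black_pixels(black, max_neighbors=2):
    """Turn black pixels with at most `max_neighbors` black 8-neighbours white.

    `black` is a 2-D boolean array in which True marks a black pixel.
    Returns a new boolean array of the same shape.
    """
    ink = black.astype(np.uint8)
    # Pad with white so that border pixels simply have fewer neighbours
    padded = np.pad(ink, 1, mode="constant", constant_values=0)
    h, w = ink.shape
    neighbors = np.zeros((h, w), dtype=np.int32)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # do not count the pixel itself
            neighbors += padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return black & (neighbors > max_neighbors)

# Tiny demo: a 1-pixel-wide line and a solid 3x3 block
demo = np.zeros((7, 7), dtype=bool)
demo[1, :] = True
demo[3:6, 2:5] = True
cleaned = remove_sparse_black_pixels(demo)
```

This only removes lines that are one pixel wide; thicker lines have more than two black neighbours per pixel and would survive, as would any line pixel that touches the text.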

That seems a bit complicated for me, since I am a beginner in computer vision and OpenCV.
Any help would be appreciated, thank you.

Asked by test on Jan 03 '19 at 19:01



1 Answer

Detecting lines like these is what the path opening was invented for. DIPlib has an implementation (disclosure: I implemented it there). As an alternative, you can try using the implementation by the authors of the paper that I linked above. That implementation does not have the "constrained" mode that I use below.

Here is a quick demo of how you can use it:

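import diplib as dip
import matplotlib.pyplot as pp

# Read the image and invert it, so that lines and text are bright on a dark background
img = 1 - pp.imread('/home/cris/tmp/DWRTF.png')
# Keep only structures that contain a path at least 300 pixels long: the noise lines
lines = dip.PathOpening(img, length=300, mode={'constrained'})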

Here we first invert the image, so that the lines and the text are bright on a dark background, because that makes the later steps easier. If you do not invert, use a path closing instead. The lines image:

lines

Next we subtract the lines. A small area opening then removes the few isolated line pixels that the path opening did not capture, and which are therefore left over after the subtraction:

text = img - lines
text = dip.AreaOpening(text, filterSize=5)

text

However, we've now made gaps in the text. Filling these up is not trivial. Here is a quick-and-dirty attempt, which you can use as a starting point:

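# Threshold both images to binary
lines = lines > 0.5
text = text > 0.5
# Grow the text 3 steps into the line mask and drop those pixels from the mask,
# so that line pixels touching the text stay in the image and bridge the gaps
lines -= dip.BinaryPropagation(text, lines, connectivity=-1, iterations=3)
# Erase the remaining line pixels from the (inverted) image
img[lines] = 0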

final result
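
Since the image was inverted at the start, it probably needs to be flipped back before OCR, because Tesseract generally works best with dark text on a light background. A minimal sketch, assuming the result is (or can be converted to) a float array with values in [0, 1]; the helper name `to_ocr_input` is hypothetical.

```python
import numpy as np

def to_ocr_input(inverted):
    """Convert an inverted float image in [0, 1] to 8-bit dark-on-light.

    `inverted` has bright text on a dark background; the result has
    dark text (0) on a light background (255), as a uint8 array.
    """
    arr = np.clip(np.asarray(inverted, dtype=np.float64), 0.0, 1.0)
    return np.round(255 * (1.0 - arr)).astype(np.uint8)

# Tiny demo: 1.0 (text) becomes 0, 0.0 (background) becomes 255
sample = np.array([[0.0, 1.0], [0.25, 0.0]])
ocr_ready = to_ocr_input(sample)
```

The resulting array can be written with `cv2.imwrite`, as in the question, and then passed to Tesseract.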

Answered by Cris Luengo on Sep 19 '22 at 17:09